Tuesday, 2020-04-07

fungithanks!00:03
*** mlavalle has quit IRC00:09
johnsomAny advice about what I should do when I get "failed to push some refs"?00:11
johnsomhttps://www.irccloud.com/pastebin/4qFPQk1Z/00:11
ianwjohnsom: in what context?  first thought is that you're trying to update a change that doesn't need updating?00:13
clarkbjohnsom: you aren't pushing a merge commit are you?00:13
johnsomI was doing the normal "git review" after fixing a requirements.txt00:13
johnsomThere is a screen and a half of spinning lines above it. I can paste it all if you would like00:14
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Add bionic-plain non-voting testing  https://review.opendev.org/71765500:14
openstackgerritIan Wienand proposed zuul/zuul-jobs master: test jobs: fixup autogeneration header  https://review.opendev.org/71765600:14
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763900:14
johnsomhttps://www.irccloud.com/pastebin/chJB2kXw/00:15
ianw"internal error" there looks ... interesting00:15
johnsomFull thing00:15
ianwbecause ^^ i am also pushing a stack right now that seems to have stopped00:15
johnsomgerrit bot announced it. I'm just ... concerned00:15
clarkbjohnsom: gerrit bot announced it meaning the change was pushed? have a link to it?00:16
johnsomhttps://review.opendev.org/71137600:16
fungipossible that happened right at utc midnight?00:16
johnsomThe two files I changed reflect my change00:16
clarkbjohnsom: thats a lot more than just a requirements update00:16
mnaserzuul-preview has been updated to fix the security issue that it had.   is there a way to know if/when it has been redeployed?00:17
clarkbjohnsom: can you use more words to express that? the change has a lot of things in it. much more than 2 files00:17
johnsomSorry, my revision, patch set 8, only updated the requirements.txt and lower-constraints.txt files.00:18
johnsomBumped both to octavia-lib 2.0.000:18
clarkbjohnsom: ok, from gerrit's and git's perspective you push the entire patch each time00:18
ianw ! [remote rejected] HEAD -> refs/for/master%topic=ensure-pip (internal error)00:18
ianwi just got the same thing00:18
fungiinfra-root: per mnaser's question, should it be safe to boot zuul-preview.o.o back up (it's presently in shutdown state) and then immediately disable apache until configuration management gets applied?00:18
clarkb[2020-04-07 00:18:20,275] [SSH git-receive-pack '/zuul/zuul-jobs.git' (iwienand)] WARN  com.google.gerrit.server.git.AsyncReceiveCommits : Error in ReceiveCommits while processing changes for project zuul/zuul-jobs00:19
clarkbianw: ^ was that you?00:19
clarkboh ya its got your username in it00:19
ianwvery likely :)00:19
clarkbWARN  com.google.gerrit.server.git.MultiProgressMonitor : MultiProgressMonitor worker killed after 240205ms(timeout 205ms, cancelled)00:19
fungi"Error in ReceiveCommits while processing changes for project" sounds like we could have a problem with the local repositories, i suppose00:19
clarkbfungi: it seems like its taking too long so gerrit is stopping it00:20
clarkbmelody doesn't show garbage collecting which is a commonish reason for things being slow00:21
fungiright, maybe we've got something else timing out... db connections?00:21
ianwthis all has my local timestamp of 10:14 ... same as openstackgerrit's comments above https://review.opendev.org/#/q/topic:ensure-pip+(status:open+OR+status:merged)00:21
clarkbianw: was that all of your commits or only some?00:21
ianwhowever, openstackgerrit only mentioned 3 of the commits00:21
clarkbbut ya I'm wondering if this is related to db backups00:21
mnaserfwiw on sunday we observed some weird behaviour with gearman disconnecting from zuul.  i don't think this is related, but it may signal *something*00:21
fungilooks like mysqldump ran from 00:00 to 00:11z00:21
ianwthat stack was rebased ontop of the recent renames in zuul-jobs, so all should have been updated00:22
clarkbstackalytics bot is doing a ton of queries right now too00:22
clarkbianw: johnsom octavia and zuul-jobs are the only two things that show "Error in ReceiveCommits while processing changes for project"00:23
clarkbmy hunch is that db backup and/or stackalytics queries are impacting performance of pushes in the short term00:23
clarkbpossibly those two things happening together00:24
fungi240205ms is just over 4 minutes, so that still doesn't put initialization of the call inside the mysqldump window00:24
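The arithmetic behind that remark can be sketched in Python. This is only an illustration: it assumes the MultiProgressMonitor warning was logged at about the same moment as the 00:18:20 ReceiveCommits warning pasted above (the log does not confirm this), and it takes the mysqldump window of 00:00 to 00:11z from the earlier message. The helper name `call_start` is hypothetical.

```python
from datetime import datetime, timedelta


def call_start(killed_at: datetime, elapsed_ms: int) -> datetime:
    """Return when a worker started, given when it was killed and how long it ran."""
    return killed_at - timedelta(milliseconds=elapsed_ms)


# Assumed kill time (from the pasted ReceiveCommits warning) and dump end time.
killed_at = datetime(2020, 4, 7, 0, 18, 20)
dump_end = datetime(2020, 4, 7, 0, 11, 0)

started = call_start(killed_at, 240205)
# True means the push began after mysqldump had already finished.
print(started.time(), started > dump_end)
```

Under those assumptions the push would have started a little after 00:14z, a few minutes after the dump ended, which matches the point that the call did not begin inside the mysqldump window.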
clarkbfungi: could be request backlog due to db locks though00:24
* clarkb pushes a change to see if it is still happening00:25
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Add bionic-plain non-voting testing  https://review.opendev.org/71765500:25
openstackgerritIan Wienand proposed zuul/zuul-jobs master: test jobs: fixup autogeneration header  https://review.opendev.org/71765600:25
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-zuul-cloner: prefer venv to virtualenv  https://review.opendev.org/71788200:25
ianwinteresting ... i rebased on a wording change00:25
ianwit appears to have hung again00:25
fungicacti doesn't show any significant spikes in activity00:26
clarkbopendev/infra-manual change I just pushed up was fine00:26
ianwread(3, "a\325\216\32\255a\235W\343\251\245,O7\10\5\27\33\375H\r\215\222\377\31\3753\205\322*\345\351"..., 8192) = 11200:26
ianwselect(7, [3], [5], NULL, {tv_sec=300, tv_usec=0}) = 1 (out [5], left {tv_sec=299, tv_usec=999933})00:26
ianwwrite(5, "0031\2\rProcessing changes: update"..., 49) = 4900:26
clarkbhowever it was based on HEAD and one line00:26
ianwis what it's doing over and over00:26
ianw(the ssh connection)00:26
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763900:27
ianw /usr/bin/ssh -p 29418 iwienand@review.opendev.org git-receive-pack '/zuul/zuul-jobs.git'00:27
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766300:28
clarkbianw: no new errors in the log yet at least (and it looks like they are coming through slowly)00:28
clarkbso ya possibly related to that repo?00:28
fungithough johnsom's was for a different repo?00:28
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765700:28
clarkbfungi: yes for octavia00:29
johnsomopenstack/octavia00:29
* clarkb tries octavia00:29
ianwyeah, last push openstackgerrit didn't comment on 3 of 600:29
ianwthis time it did, slowly00:29
johnsomLet me know if you want me to try something, I'm just hanging out and writing a google doc00:29
fungifwiw, we have seen patchsets "go missing" when pushed during the mysqldump window, because gerrit gives up waiting for the db write to complete and so never records the push00:30
clarkbI can push to octavia with no slowness or problems00:30
ianwfungi: but the one above at 10:25 was well outside the window?00:30
clarkbianw's recent push just tripped the error log string that I was checking for00:30
clarkbmy ability to push to octavia without issue makes me think it isn't a repo problem. Could be construction of the change series?00:31
fungiianw: is it for the same change or a different change?00:31
clarkbianw: do you mind if I fetch your stack and trivially update it and push that?00:31
clarkbianw: actually now I don't know which change to update. https://review.opendev.org/#/c/717657/5 that one maybe?00:32
ianwclarkb: umm, the bottom one is ...00:33
clarkbianw: the bionic update one?00:34
ianwoh, the top one would be better00:34
clarkbianw: ya but I worry it won't exercise things as much since it's just one change being pushed00:34
clarkbbut thats why I am asking for permission before I invalidate test results :)00:34
ianwgit review -d 71765700:34
ianwthat should be all of it00:34
clarkbianw: ya I've pulled that change down and updated the one 6 ahead of it00:35
clarkbyou ok with my pushing new patchsets up for all 6?00:35
ianwyep00:35
openstackgerritClark Boylan proposed zuul/zuul-jobs master: Add bionic-plain non-voting testing  https://review.opendev.org/71765500:35
openstackgerritClark Boylan proposed zuul/zuul-jobs master: test jobs: fixup autogeneration header  https://review.opendev.org/71765600:35
openstackgerritClark Boylan proposed zuul/zuul-jobs master: fetch-zuul-cloner: prefer venv to virtualenv  https://review.opendev.org/71788200:35
clarkbseems to be doing it to me too00:35
clarkbthats good I guess?00:35
clarkbjohnsom: was your change in a stack of changes?00:36
clarkbok so gerrit knows about my newly pushed patchsets but git review doesn't return yet00:36
johnsomIt was the first one with two dependent on top00:36
clarkband gerritbot doesn't say anything yet00:36
ianwit *has* seemed to update them all though?  they all have the same timestamp now00:37
openstackgerritClark Boylan proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763900:37
clarkbianw: yup00:37
clarkbbut my git review has not returned00:37
openstackgerritClark Boylan proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766300:38
clarkbjohnsom: ianw: in ianws stack the first three seem to be processed quickly and don't add any new files00:38
clarkbthe fourth change is slower and does add new files. I notice that johnsom's change also adds files00:38
clarkbpotentially something to do with the construction of the change itself?00:39
openstackgerritClark Boylan proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765700:39
clarkbthere is that second mysql database that gerrit uses that tracks files iirc00:39
clarkbperhaps it could be related to that (that db being slow to update possibly?)00:39
fungithe h2 db?00:41
clarkbhttps://review.opendev.org/717887 was just pushed up and adds a new file. That was not slow at all00:41
clarkbfungi: its not an h2 db, its in mysql00:41
clarkbhttps://review.opendev.org/717887 probably rules out the new file being the cause00:41
ianw[2020-04-07 00:36:20,564] [ReceiveCommits-16] ERROR com.googlesource.gerrit.plugins.its.storyboard.StoryboardClient : Failed to post, response: 500 (Internal Server Error)00:41
fungihrm... accountPatchReviewDb?00:41
ianwmaybe something to do with that ... is that the first one that has a Story: #00:42
clarkbianw: oh interesting00:42
clarkbfungi: ya that sounds correct00:42
clarkbfwiw `gerrit show-queue -w` doesn't show any outstanding queue items00:42
fungioh, could this be the its-storyboard plugin misbehaving somehow?00:42
clarkbya I'm betting the storyboard thread is the one to pull on00:42
johnsomMy patch has a story00:43
clarkbI need to go help make dinner but can check in in a bit00:43
ianwlogging in to see what that 500 might be about00:43
fungistoryboard api still seems responsive00:44
ianw[Tue Apr 07 00:40:00.314321 2020] [wsgi:error] [pid 9734:tid 140188982429440] 2020-04-07 00:40:00.313 9734 ERROR wsme.api [-] Server-side error: "This session is in 'inactive' state, due to the SQL transaction being rolled back; no further SQL can be emitted00:44
ianw within this transaction.". Detail:00:44
ianwhang on, it was 00:3600:44
ianwsame msg @ Tue Apr 07 00:36:20.561390 202000:45
ianwhttp://paste.openstack.org/show/791710/00:45
fungimysqld is running00:46
fungi[Mon Apr  6 23:11:25 2020] Out of memory: Kill process 4923 (apache2) score 745 or sacrifice child00:46
fungii suspect this is related00:47
ianwoh great, one of those errors where the only hits you get is from sites indexing the code that is throwing it00:47
fungiianw: i had one last week where i searched for an obscure error in some (not ours) software and the only hit was an irc log where we were scratching our heads about it 5 years ago00:47
fungii'm going to restart services on storyboard.o.o00:48
ianwit would explain the behaviour though of gerrit seeming to get all the changes, but follow-up processing being weird00:49
fungithe server is currently nearly 4gb into swap00:49
fungilooks like mysql is eating 4gb and an apache worker has another 4gb00:49
fungimay actually just reboot the whole server to make sure everything comes up clean and i don't miss anything00:49
fungiit's rebooting now00:50
fungi#status log rebooted storyboard.openstack.org following a slew of out-of-memory conditions which killed various processes00:51
openstackstatusfungi: finished logging00:51
fungiserver's back up00:51
ianwi'll try the stack again00:52
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Add bionic-plain non-voting testing  https://review.opendev.org/71765500:52
openstackgerritIan Wienand proposed zuul/zuul-jobs master: test jobs: fixup autogeneration header  https://review.opendev.org/71765600:52
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-zuul-cloner: prefer venv to virtualenv  https://review.opendev.org/71788200:52
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763900:52
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766300:52
fungiyeah, looks like it was thrashing so hard even snmp queries were going unanswered00:52
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765700:52
ianwyay \o/00:52
ianwmystery solved i guess00:53
fungihttp://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=66356&rra_id=all00:53
fungiyikes00:53
fungilooks like things just started going nuts around 20:0000:53
johnsomo/ I'm not too worried that storyboard didn't get updated for this patch, so I'm off to make dinner. Thanks folks00:55
fungithanks for bringing it to our attention johnsom!00:55
fungii suspect the pattern we're seeing on that graph is something causing an apache worker to eat all the memory, then it gets killed, then another apache worker does the same, over and over00:56
fungisince the storyboard api is running in mod_wsgi though that could be pretty much any api interaction at fault00:57
fungithe first oom in the syslog starts at 20:27:0001:03
fungias expected, it reached 100% utilization of all 8gb ram and 8gb swap01:04
fungithe pid killed that time was an apache2 process with 3438675 total_vm (1839417 rss, 1103106 swapents)01:07
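To put those OOM-killer numbers in familiar units, here is a small Python sketch. It assumes the counters are in pages, as in the kernel's OOM task dump, and that the page size is 4 KiB (typical for x86_64, but not stated in the log). The helper name `pages_to_gib` is made up for illustration.

```python
PAGE_SIZE = 4096  # bytes per page; an assumption, not taken from the log


def pages_to_gib(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Convert a page count to GiB."""
    return pages * page_size / 2**30


# Values quoted for the killed apache2 process.
counters = {"total_vm": 3438675, "rss": 1839417, "swapents": 1103106}
for name, pages in counters.items():
    print(f"{name}: {pages_to_gib(pages):.1f} GiB")
```

With a 4 KiB page this works out to roughly 7 GiB resident and a bit over 4 GiB swapped for the one worker, in line with the server exhausting its 8gb of RAM and eating well into swap.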
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-zuul-cloner: prefer venv to virtualenv  https://review.opendev.org/71788201:08
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763901:08
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766301:08
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765701:08
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-zuul-cloner: prefer venv to virtualenv  https://review.opendev.org/71788201:15
funginot seeing any unusual traffic jumping out at me from the access logs01:15
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763901:15
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766301:15
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765701:15
fungialso those wsgi "IOError: failed to write data" messages are probably unrelated, we get a handful of those every day in the logs for as far back as our log retention goes01:18
fungithe user-facing impact (gerrit et cetera) likely didn't start until 23:49:39z when the first sqlalchemy traceback appears in the apache error log01:20
fungi[Mon Apr 06 23:49:39.702752 2020] [wsgi:error] [pid 9734:tid 140188890109696] [client 216.54.31.86:5548] DBError: Can't reconnect until invalid transaction is rolled back, referer: https://storyboard.openstack.org/01:21
fungianyway, not sure what else to look at for the moment, other than to keep an eye on the cacti graphs for the server and see if we get any recurrence01:22
*** jentoio has joined #opendev01:26
*** ysandeep|away is now known as ysandeep|rover02:02
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-zuul-cloner: prefer venv to virtualenv  https://review.opendev.org/71788202:59
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763902:59
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766303:00
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765703:00
*** prometheanfire has quit IRC03:11
*** prometheanfire has joined #opendev03:12
kevinzianw: talk here:-)  https://a1f557d8e7f24035a796-8b5cc018a02fed487ca2a6d59a4ee9d5.ssl.cf1.rackcdn.com/716839/1/check-arm64/dib-functests-arm64-bionic/129eb08/logs/ubuntu-minimal_bionic-arm64-build-succeeds.FAIL.log03:12
kevinzConnection failed [IP: 139.178.85.143 80]03:12
*** ysandeep|rover is now known as ysandeep|afk03:13
kevinzianw: I remember it has happened before in Linaro London03:13
ianwkevinz: oh yes sorry frickler mentioned that, i meant to follow up03:13
ianwthat's the mirror?03:14
kevinzianw: no worries, yes that is the mirror03:14
ianwyeah ... i guess it doesn't have AAAA records?03:14
kevinzianw: i can access it locally03:14
prometheanfireanyone able to take a min to review this glean fix https://review.opendev.org/71733903:15
prometheanfirefixes some resolved functionality03:15
ianwkevinz: hrm, it does have AAAA records so i wonder why we try ipv4?03:16
kevinzianw: Aha, I think it may be due to the settings. I changed the setting in DNS to use host dns config03:17
kevinzianw: I changed this because Kolla needs this IPv4 dns for docker building, as IPv6 for some apt/yum resources is not stable.03:18
kevinzianw: let me think how to fix this. I can set os-jobs subnet dns to host dns server, and leave os-controls subnet dns as default.03:20
*** ysandeep|afk is now known as ysandeep|rover04:18
*** ykarel|away is now known as ykarel04:20
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763904:21
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-zuul-cloner: prefer venv to virtualenv  https://review.opendev.org/71788204:21
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766304:21
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765704:22
prometheanfireianw: thanks for the review, unfortunately this bug was turned up in a more basic gentoo install, the previous stage4 wrote a file (with google dns servers) to /etc/resolv.conf, hiding the error, stage3 doesn't have that file04:26
prometheanfireit seems like what we need is...  if file exists and is not symlink, write file.  if file exists and is symlink (not dangling), write file.  if file does not exist and is symlink, do NOT write file (resolved).   if file does not exist and is not symlink, write file (unconfigured system)04:29
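The four cases listed there reduce to "write unless the path is a dangling symlink". A minimal Python sketch of that rule follows; it is not glean's actual code, and the function name `should_write_resolv_conf` is hypothetical.

```python
import os


def should_write_resolv_conf(path: str = "/etc/resolv.conf") -> bool:
    """Return False only when path is a symlink whose target does not exist."""
    # os.path.islink inspects the link itself, while os.path.exists follows
    # the link, so a dangling symlink is islink=True and exists=False.
    dangling = os.path.islink(path) and not os.path.exists(path)
    return not dangling
```

A regular file, a symlink with a live target, and a missing path all return True; only the dangling-symlink case (resolved not yet started) returns False.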
prometheanfireianw: https://review.opendev.org/#/c/257173/ could work I guess, was hoping to avoid that04:35
prometheanfirethe problem with it is that it disables resolved04:36
*** prometheanfire has quit IRC04:40
*** prometheanfire has joined #opendev04:40
ianwi ... don't quite know :)  perhaps glean can not decide where the file should point to at all?  it should all be up to the image to provide the right thing?04:44
prometheanfireianw: if only systemd sucked a little less :P04:46
fungiyeah, problem is systemd's "resolved" eventually writes out a file with dns info, i guess, but until it does /etc/resolv.conf is a dangling symlink?04:46
prometheanfireya, that's basically it04:47
fungibut if glean writes to the symlink target then resolved will decide it shouldn't touch it?04:47
prometheanfireno, the target's directory does not exist04:47
ianwmore if that's a symlink, then dhclient/networkmanager won't write to it IIRC04:47
prometheanfire/run/systemd/resolved04:47
prometheanfiretmpdir that's created when resolved starts, after glean has run04:48
fungii see04:48
ianwbut also, for like RAX where we have to write out the nameservers from configdrive, i think there's chicken/egg?04:48
prometheanfirenot sure about that, I use config drive at home, so maybe I'm experiencing the same thing04:49
prometheanfireI thought glean had to use config drive04:49
ianwyes it doesn't do http meta-data reading, but it's more about if the environment supports dhcp or not04:50
prometheanfiresure04:52
openstackgerritMatthew Thode proposed openstack/diskimage-builder master: use stage3 instead of stage4 for gentoo builds  https://review.opendev.org/71717704:52
prometheanfirewell, included that, won't help the gcc build time, but should help it look like it boots04:53
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Update registry test to use ensure-podman and ensure-docker  https://review.opendev.org/71675204:53
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763904:53
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-zuul-cloner: prefer venv to virtualenv  https://review.opendev.org/71788204:53
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766304:53
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765704:53
prometheanfirenow, if only I could get hurricane electric to stop sending me duplicate packets04:54
openstackgerritAndreas Jaeger proposed openstack/project-config master: Retire OSA repo_build and pip_install projects  https://review.opendev.org/71777504:55
prometheanfirewell, gonna go offline, if you have a reply, I guess reply in a bug (should be back in 5-10 min)04:55
ianwi'm quite willing to believe that glean is either doing the wrong thing, or can not make the right decision in this case, and the base image should configure /etc/resolv.conf in such a way that if glean writes to it it "does the right thing"04:57
*** prometheanfire has quit IRC05:01
*** prometheanfire has joined #opendev05:12
prometheanfirewell, he is sucking hard eggs05:12
openstackgerritMerged zuul/zuul-jobs master: Update registry test to use ensure-podman and ensure-docker  https://review.opendev.org/71675205:14
*** ralonsoh has joined #opendev06:02
*** dpawlik has joined #opendev06:30
*** rpittau|afk is now known as rpittau06:56
*** DSpider has joined #opendev07:06
*** sgw has quit IRC07:11
AJaegerconfig-core, please review https://review.opendev.org/717715 https://review.opendev.org/717655 https://review.opendev.org/71771307:19
*** tosky has joined #opendev07:24
*** ysandeep|rover is now known as ysandeep|lunch07:44
fricklerAJaeger: I wouldn't merge 717655 on its own, rather wait for the stack to be a bit more complete08:02
*** ralonsoh has quit IRC08:05
openstackgerritMerged openstack/project-config master: Retire repo_build and pip_install roles  https://review.opendev.org/71771508:07
openstackgerritMerged zuul/zuul-jobs master: docs: fix a typo in `run-test-command`  https://review.opendev.org/71771308:15
*** ralonsoh has joined #opendev08:17
openstackgerritmathieu bultel proposed openstack/project-config master: Add validations-common and validations-libs to pypi  https://review.opendev.org/71797608:26
*** ysandeep|lunch is now known as ysandeep|rover08:44
*** ykarel is now known as ykarel|lunch08:48
*** roman_g has joined #opendev09:02
openstackgerritTobias Henkel proposed zuul/zuul-jobs master: Support ssh-enabled windows hosts in add-build-sshkey  https://review.opendev.org/65371209:12
*** ykarel|lunch is now known as ykarel09:41
*** ykarel is now known as ykarel|meeting10:02
*** rpittau is now known as rpittau|bbl10:25
*** ykarel|meeting is now known as ykarel10:37
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005711:24
*** ysandeep|rover is now known as ysandeep|afk11:29
*** hashar has joined #opendev11:31
*** hashar has quit IRC11:42
*** ysandeep|afk is now known as ysandeep|rover11:56
*** rpittau|bbl is now known as rpittau12:06
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005712:18
*** hashar has joined #opendev12:31
mnaserinfra-root: can we enable zuul-preview again?  The security issue has been patched now12:44
*** sgw has joined #opendev13:12
*** roman_g has quit IRC13:21
mordredmnaser: yeah - probably so. I'm slow this morning so I wanna wait for corvus to be up and verify he's happy13:49
mnasermordred: ok cool :)13:52
*** mlavalle has joined #opendev13:58
openstackgerritMerged zuul/zuul-jobs master: Add bionic-plain non-voting testing  https://review.opendev.org/71765514:14
corvusmnaser, mordred: yep, looks like the image is published; i'll remove zp01 from the emergency file and manually start it back up14:17
corvus#status log removed zp01 from emergency file since open-proxy issue is resolved in container image14:17
openstackstatuscorvus: finished logging14:18
mordredcorvus: woot14:19
corvusthe, ah, host doesn't appear responsive14:19
corvusit was in 'shutoff' state; i'm restarting it14:24
corvusmnaser: should be up and running the updated image now14:25
*** ysandeep|rover is now known as ysandeep|away14:33
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005714:34
*** pramchan has joined #opendev14:38
openstackgerritMonty Taylor proposed opendev/system-config master: Make a new dockerized etherpad.opendev.org  https://review.opendev.org/71644214:38
mnasercorvus: cool, thanks, i'll try it out in a bit :)14:42
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005715:11
johnsomclarkb Morning, I assume you are done with https://review.opendev.org/717885?15:25
clarkbjohnsom: yes, I'll abandon it15:26
johnsomThank you. Doing our freeze week review scrubs. grin15:26
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005715:45
fricklercorvus: ffmuc folks have built updated jitsi apps, I tested .deb and x64.appimage and they seem to work fine against meetpad for me https://github.com/freifunkMUC/jitsi-meet-electron/releases/tag/v2.0.515:55
AJaegerconfig-core, infra-root, please review https://review.opendev.org/717814 https://review.opendev.org/717815 https://review.opendev.org/717813 https://review.opendev.org/718124 https://review.opendev.org/717822  - this is all changing install-* to ensure-* roles15:57
AJaegerconfig-core, and one more review, please : https://review.opendev.org/717976 - adding pypi jobs for two new repos16:00
*** dpawlik has quit IRC16:00
openstackgerritMerged zuul/zuul-jobs master: Add support for RedHat platforms on ensure-podman  https://review.opendev.org/71657816:03
openstackgerritClark Boylan proposed opendev/system-config master: Improving logging of prod playbook jobs  https://review.opendev.org/71816116:05
clarkbmordred: ^ what do you think of that?16:05
clarkbAJaeger: looking16:05
mordredclarkb: I think that's great16:07
*** rpittau is now known as rpittau|afk16:08
corvusfrickler: cool, i think we're probably going to update meetpad to point at etherpad.opendev.org when that's finished and then see how things are working16:09
clarkbfrickler: corvus are we thinking the app might be more stable than web browser?16:09
clarkbor just a useful tool to add to the toolchest?16:09
corvusclarkb: oh, i wasn't thinking that; i thought that was just an 'fyi' from frickler saying he tested it16:09
clarkbah16:10
*** ykarel is now known as ykarel|away16:10
openstackgerritMonty Taylor proposed opendev/system-config master: Make a new dockerized etherpad.opendev.org  https://review.opendev.org/71644216:11
mordredspeaking of - maybe that'll be green this time16:11
openstackgerritMohammed Naser proposed zuul/zuul-jobs master: fetch-javascript-output: add metadata.type  https://review.opendev.org/71816316:11
openstackgerritMerged openstack/project-config master: Add validations-common and validations-libs to pypi  https://review.opendev.org/71797616:19
fungialso looks like we might ought to try running tip of the develop branch for now (like etherpad-dev) until 1.8.1 releases: https://github.com/ether/etherpad-lite/issues/378116:21
fungiwhich is probably any day now16:21
clarkbfungi: I think we are going to use their upstream docker image16:22
fungi(which might end up being released as 1.8.3, i'm unclear on what the decision is there)16:22
fungioh, in that case i guess their images are build from the tagged releases?16:22
fungier, built16:22
clarkbyes appears that way16:22
openstackgerritMerged openstack/project-config master: Use ensure-* roles  https://review.opendev.org/71781316:23
openstackgerritMerged opendev/base-jobs master: Use ensure-* roles  https://review.opendev.org/71781516:26
fungiin that case, i guess we should just be on the lookout for a new etherpad image for the upcoming point release16:26
*** lpetrut has joined #opendev16:27
mordredyeah - and bump the version in the compose file16:30
mordredthey don't seem to release a 1.8 image tag - just the full point release16:30
AJaegerclarkb: thx for reviews16:31
openstackgerritMerged zuul/zuul-jobs master: fetch-javascript-output: add metadata.type  https://review.opendev.org/71816316:36
*** pramchan has quit IRC16:49
mordredclarkb: https://review.opendev.org/#/c/716442/ is green (at least the run-etherpad job is) so is ready for re-review when you get a sec16:56
*** lpetrut has quit IRC16:56
mordredclarkb: I've also got a mysqldump of production I'll apply to the local db once it's up and running so we can test things16:56
mordredclarkb: I figure rollout process will be turn off old, swap dns, dump db, restore dump on new server - will have an effective downtime of the dns ttl for folks16:57
fungiprobably worth announcing, not only because it will be down briefly, but also in case folks notice anything broken with older pads16:58
mordredyeah16:58
fungii did notice one of clarkb's test pads on etherpad-dev stopped working normally after upgrading16:58
mordredalso - we should probably poke at the new install for at least a few days before we announce a cutover16:59
clarkbfungi: the clarkb-test pad was broken a long time ago16:59
fungithis was clarkb-test2, but we had tortured that one fairly severely and it was around for years, so not entirely surprising16:59
funginevermind, it's working again now17:00
fungiit was failing to load with the new colibris theme17:00
fungireverting to the no-skin theme seems to have got it back to normal17:01
fungiso maybe a bug in the theme itself17:01
clarkbwouldn't surprise me if formatting of pads is affected in bad ways by new themes17:01
fungiyeah, me either17:02
fungitheming support in it is relatively new anyway17:02
mnaserif there’s a change in an image that opendev consumes, does it automatically update it every from run?17:03
clarkbmordred: I'm still not sure I understand the utf8 thing because in the mysqld portion of the config you haven't set it to utf8mb4 https://review.opendev.org/#/c/716442/19/playbooks/roles/etherpad/files/my.cnf17:03
mnaser(zuul preview change merged and just wondering if we have to bump17:04
mnasersomething somewhere)17:04
clarkbmnaser: yes, I believe all of our images auto update17:04
mordredclarkb: right. that's specifying the type on the server side of the connection. I'm setting it there because that's what's set on our current trove instance17:09
mordredclarkb: separately, the schemas and tables have their own character sets17:09
clarkbmordred: so that doesn't affect the default encoding when you create a new table?17:10
clarkb(because the issue aiui is that etherpad doesn't specify the encoding so we have to help it)17:10
clarkbbut maybe thats since been fixed in the software and is a non issue17:11
mordredclarkb: yeah - but our tables will be created from the mysqldump - which will17:11
mordredyeah17:11
mordredwell - I have no idea what etherpad will do if we let it do its own thing17:11
clarkbmordred: back when we had problems with it it was using utf8 but not understanding that utf8 is only 3 bytes wide17:11
clarkbso we did an end around etherpad and converted to utf8mb417:11
mordredclarkb: http://paste.openstack.org/show/791754/17:12
mordredthis is what is in the mysqldump17:12
mordredwhich is how the table will get created17:12
fungithough it may still be worth solving for greenfield deployments17:12
fungithat can always come later17:12
mordredI mean - we could set it to mb4 in the file - I'm pretty sure it won't make any difference ... but it is a change from current prod and I'm not 100% sure how to test that it's not going to break something17:13
mordredI'm 99% sure it won't17:14
* mordred doesn't feel strongly - mostly just explaining thinking - happy to change it if that's what we want17:15
clarkbI don't feel strongly and as long as the tables explicitly set the 4 byte type in their schemas we shouldn't regress17:16
mordredclarkb: also - if it breaks, we can always change the settings and re-restore the dump :)17:17
* mordred is going to pull the trigger17:17
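The 3-byte versus 4-byte point above can be checked directly. This is a minimal sketch, not anything taken from etherpad or the system-config change: the helper names `max_utf8_width` and `fits_mysql_utf8` are hypothetical, and the only fact it relies on is that MySQL's legacy `utf8` character set stores at most 3 bytes per character while `utf8mb4` stores up to 4.

```python
def max_utf8_width(text: str) -> int:
    """Return the widest UTF-8 encoding, in bytes, of any character in text."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_mysql_utf8(text: str) -> bool:
    """True if every character fits in MySQL's 3-byte 'utf8' character set.

    Hypothetical helper for illustration; characters outside the Basic
    Multilingual Plane need 4 bytes and therefore require utf8mb4.
    """
    return max_utf8_width(text) <= 3


if __name__ == "__main__":
    samples = {"ascii": "pad", "accented": "é", "cjk": "漢", "emoji": "😀"}
    for name, text in samples.items():
        print(name, max_utf8_width(text), fits_mysql_utf8(text))
```

Pad content with emoji is the kind of text that would not fit a 3-byte column, which is why the table-level character set in the dump matters more than the server default here.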
mordredclarkb, fungi : also - maybe it's time to land some more of these: https://review.opendev.org/#/q/topic:infra-prod-zuul+status:open ?17:33
clarkbmordred: is there a change to switch to a more frequent periodic pipeline yet? I'm looking at the next one in the stack and it is for zuul-preview https://review.opendev.org/#/c/717053/1 and if we want image updates (which just happened for zuul-preview) we need it to run semi frequently on a timer17:36
clarkboh thats https://review.opendev.org/#/c/717064/3/.zuul.yaml should we maybe stick that lower in the stack?17:36
mordredclarkb: want me to rebase it for that? or we could just land the whole stack up til that one too17:39
clarkbmordred: I worry that if we run into problems as we go that we might miss that? doesn't look like much of the stack has reviews at this point so we don't lose much in a rebase (do have to wait for jobs to rerun though)17:41
clarkbwe can land some and see where we end up I guess17:42
fungithough to take the zuul-preview case as an example, we'd eventually trigger the deployment from promote after the image upload too, right?17:43
fungiso that we eventually reach the instant-gratification model17:43
clarkbmordred: for review I'm looking at the dep list at https://review.opendev.org/#/c/717054/1/.zuul.yaml do we need to set up dependencies between manage-projects too?17:44
fungii guess that depends on the earlier discussion about triggering deployments from events for other projects17:44
clarkbfungi: we won't because zuul image builds happen in a different tenant17:44
fungiahh, yeah that too17:44
clarkbwhich is fine we can continue to just have an hourly cron for external dep updates17:44
fungiso zuul-preview is not really a case of an internal dep, right17:45
fungibecause we're not building our own images, we're using the zuul images17:45
mordredyeah17:46
clarkbmordred: left the question about deps on https://review.opendev.org/#/c/717054/1 after some thought I think the semaphore avoids any major issues I can think of with that17:46
mordredclarkb: re: manage-projects ... hrm17:46
mordredyeah17:46
clarkbmordred: so I +2'd it with a note17:46
mordredclarkb: I've got a rebased stack locally if you want me to push it up (with the hourly pushed earlier)17:47
openstackgerritMerged opendev/system-config master: Mention new mailing lists  https://review.opendev.org/71782417:47
clarkbmordred: maybe push it up between afs and puppet else?17:47
clarkbseems like puppet else is another good pause point since thats a bit of a change functionally17:47
mordrednah - I rebased it for right after zuul-preview17:47
clarkbthat works too if you want, I've only gotten to review so far17:47
mordredso we can start off with zuul-preview going hourly17:47
mordredkk. one sec17:48
clarkb++17:48
openstackgerritMonty Taylor proposed opendev/system-config master: Run zuul and nodepool related deploys hourly  https://review.opendev.org/71706417:48
openstackgerritMonty Taylor proposed opendev/system-config master: Run review and review-dev in zuul  https://review.opendev.org/71705417:48
openstackgerritMonty Taylor proposed opendev/system-config master: Run gitea in zuul  https://review.opendev.org/71705517:48
openstackgerritMonty Taylor proposed opendev/system-config master: Run AFS in zuul  https://review.opendev.org/71705617:48
openstackgerritMonty Taylor proposed opendev/system-config master: Run remote-puppet-else in zuul  https://review.opendev.org/71705717:48
openstackgerritMonty Taylor proposed opendev/system-config master: Remove run_all.sh and ansible cron job  https://review.opendev.org/71705817:48
openstackgerritMonty Taylor proposed opendev/system-config master: Remove ansible-cron role  https://review.opendev.org/71705917:48
openstackgerritMonty Taylor proposed opendev/system-config master: Trigger everything on inventory changes  https://review.opendev.org/71711417:48
clarkband I'll have to switch gears to meeting prep soon enough so biting off a reasonable chunk here seems worthwhile17:48
mordredfungi: https://review.opendev.org/#/c/717064 look ok to you?17:57
fungimordred: yep, that's pretty straightforward18:05
mordredwoot18:05
openstackgerritMerged opendev/system-config master: Make a new dockerized etherpad.opendev.org  https://review.opendev.org/71644218:06
openstackgerritMerged opendev/system-config master: Run zuul-preview in zuul  https://review.opendev.org/71705318:06
*** hashar has quit IRC18:10
mnaserinfra-root: http://site.925bfe37815144d0859f260605d5fb98.zuul.zuul-preview.opendev.org is giving me an internal server error -- anyone happen to be able to pull out the error message?18:19
* mordred looking18:20
clarkbmnaser: https://zuul.opendev.org/t/zuul/build/925bfe37815144d0859f260605d5fb98 I haven't looked at the error message but that doesn't seem to have a url18:21
clarkb(is that a rendering issue?)18:22
mordredwow. it straight up got a core18:22
mordred[Tue Apr 07 17:15:30.780412 2020] [core:notice] [pid 15:tid 140579200193792] AH00094: Command line: '/usr/sbin/apache2 -D FOREGROUND -e info'18:22
mordredalso - we should bind-mount a logs dir18:22
clarkbmordred: the zuul-preview does just assume that artifacts[i][url] will be present if the type matches18:23
clarkbmordred: which could result in a core dump?18:23
clarkb(I point that out because we don't render a url entry for that artifact in the link above)18:23
mordredI agree18:24
mnaseroh, uh18:24
mnaserthe type wasn't even picked up -- https://zuul.opendev.org/api/tenant/zuul/build/30de77395f8b4ade9b1489cf4de2455918:24
mordredremote:   https://review.opendev.org/718186 Check to make sure artifact has a url18:25
mordredclarkb, corvus : ^^18:25
clarkbmnaser: thats a different build18:25
mnaseroh, yes, you're right18:25
clarkbhttps://zuul.opendev.org/api/tenant/zuul/build/925bfe37815144d0859f260605d5fb98 does show the url18:26
clarkbso its a rendering issue18:26
mordrednod18:26
clarkbok not that problem then :)18:26
mnaserhttps://zuul.opendev.org/api/tenant/zuul/build/30de77395f8b4ade9b1489cf4de2455918:26
clarkbmordred: we probably do still want to check the validity of the inputs though for nicer logging :)18:26
mnasermaybe i am doing something silly18:26
clarkbmnaser: why do you keep looking at that other build?18:27
mnaseruhh, no idea.18:27
mnaserah, ok18:28
clarkbmnaser: that second build is from yesterday18:28
mordredclarkb: yeah - I think that patch is still correct, but I don't know that it fixes this18:28
clarkbI'm not sure it is new enough to have the info you want18:28
mnasermy browser had "preserve log" option selected18:28
mnaserso it wasn't wiping the request history when i was F5ing18:29
mordredthis zuul-preview has been running for 4 days18:29
mordredgah18:29
mordredthis zuul-preview has been running for 4 hours18:29
mordredso - perhaps we haven't restarted since the new patch landed?18:29
clarkbmordred: ok so maybe just needs to be updated to the new code18:29
mnaseroh so maybe it hasn't picked up the change18:29
mnaserit _should_ have been updated by now18:29
mnaseroh, maybe it's in the emergency file?18:29
mordredwe're running on sha 5a13fb00de7a5387af917943982a9be832bf5c200c432b8e66ad5325a812891618:29
mordred(docker sha)18:30
clarkbmnaser: corvus removed it, but we're also changing it from cron to cd (and need cron for image deployment)18:30
clarkbits possible it just got caught in that18:30
mordredlatest image is 216be64cf86cd46cb56ee577909b0a189e33b3825e0194615b6b23f4104e533018:30
mordredmaybe I should do a compose up?18:31
mnaseroh ok, sorry i haven't followed that effort too closely18:31
openstackgerritMonty Taylor proposed opendev/system-config master: Bind-mount apache log dir to zuul-preview  https://review.opendev.org/71818718:33
mordredclarkb: want me to do a manual restart?18:34
clarkbmordred: we are landing the cron right? should we wait for that to ensure its working?18:34
mordredmaybe so, yeah18:34
mordredmnaser: this is all part of "make things so that people who are not infra-root can see more of what's going on"18:35
clarkbmordred: are the uid and gid stable in https://review.opendev.org/#/c/718187/1/playbooks/roles/zuul-preview/tasks/main.yaml ?18:35
clarkbmordred: does that depend on the order software is installed?18:35
mnasermordred: i figured so, that's pretty good.  it'll make life easier18:35
mordredclarkb: they're just the uid/gid of the www-data user - I'm not sure how stable that is18:36
clarkbmordred: ya I'm trying to noodle on that. We can't say user: www-data because that is from the host's context18:36
mordredyup18:36
clarkbbut the container could be rebuilt and change the uid18:36
mordredyup18:36
mordredI mean - I don't think it's LIKELY that that happens18:36
clarkbthinking out loud here, would it be better to have logs go to journald?18:37
clarkb(via stdout/stderr?)18:37
mordredclarkb: it's not a contiguous set of ids18:37
mordredclarkb: so it's not like "33 is the next in the list after 32"18:37
mordredI think it might be a base debian user18:37
clarkbah so maybe ubuntu does fix that18:37
mnaseri think www-data being 33 is like a _thing_18:37
clarkbor debian in this case18:37
mordredyeah18:37
clarkbk18:37
mordredin fact, given that we also run on ubuntu - we could PROBABLY re-write that as www-data18:37
mordredin fact, why don't we do that18:38
openstackgerritJeremy Stanley proposed opendev/system-config master: Add IRC logs and ML subscribe links to opendev.org  https://review.opendev.org/71818818:38
mordredNEVERMIND18:38
clarkbfungi: wasn't there a change that already did ^18:38
mnaserim not sure if the apache2 pkgs add that18:38
mordredlogs are opened as root before the user changes18:39
openstackgerritMonty Taylor proposed opendev/system-config master: Bind-mount apache log dir to zuul-preview  https://review.opendev.org/71818718:39
clarkboh its links this time got it18:39
mordreddon't need to set ownership at all18:39
fungiclarkb: yeah, i realized in looking at the page just now that we give people the e-mail address of, like, the announcements ml, but that's probably not as useful on its own18:39
fungileaving figuring out how to find where to subscribe that as an exercise for the reader was probably not ideal18:40
openstackgerritMonty Taylor proposed opendev/system-config master: Run zuul and nodepool related deploys hourly  https://review.opendev.org/71706418:41
openstackgerritMonty Taylor proposed opendev/system-config master: Run review and review-dev in zuul  https://review.opendev.org/71705418:41
openstackgerritMonty Taylor proposed opendev/system-config master: Run gitea in zuul  https://review.opendev.org/71705518:41
openstackgerritMonty Taylor proposed opendev/system-config master: Run AFS in zuul  https://review.opendev.org/71705618:41
openstackgerritMonty Taylor proposed opendev/system-config master: Run remote-puppet-else in zuul  https://review.opendev.org/71705718:41
openstackgerritMonty Taylor proposed opendev/system-config master: Remove run_all.sh and ansible cron job  https://review.opendev.org/71705818:41
openstackgerritMonty Taylor proposed opendev/system-config master: Remove ansible-cron role  https://review.opendev.org/71705918:41
openstackgerritMonty Taylor proposed opendev/system-config master: Trigger everything on inventory changes  https://review.opendev.org/71711418:41
mordredmnaser: https://review.opendev.org/717064 is the thing for zuul-preview to run hourly18:41
clarkbas I prep for the meeting today, what are thoughts on #startmeeting name? stick with infra for now then rename later in a batch to keep logs around or?18:41
mordredclarkb: yeah18:42
mnasermordred: ok cool, will i have to wait for that to land and all?18:42
mordredclarkb: or - we could do opendev today18:42
mordredclarkb: and keep the old ones as infra18:42
mordredif we want to track the historical name18:42
* mordred could go either way18:42
clarkbmordred: actually since the schedule still calls us infra (it probably has the channel wrong too I'll update that) I think sticking with infra today may be best (its where people will look for logs I expect)18:43
clarkbthen we can change it in the future if necessary18:43
mordredmnaser: yeah - sorry - I think since zuul-preview changes slowly this is actually a good example case to make sure the automation is working18:43
mordredclarkb: ++18:43
mordredmnaser: otherwise we might have to do this all again next time :)18:43
mordred(and sorry for the delay)18:43
mnaserthat's fine, so step #1 is wait for that change to land and step #2 -- give it an hour or two?18:44
mordredyeah18:44
mnaserok cool, i'll work on some other stuff and then let this run18:45
openstackgerritMonty Taylor proposed opendev/system-config master: Run review and review-dev in zuul  https://review.opendev.org/71705418:46
openstackgerritMonty Taylor proposed opendev/system-config master: Run gitea in zuul  https://review.opendev.org/71705518:46
openstackgerritMonty Taylor proposed opendev/system-config master: Run AFS in zuul  https://review.opendev.org/71705618:46
openstackgerritMonty Taylor proposed opendev/system-config master: Run remote-puppet-else in zuul  https://review.opendev.org/71705718:46
openstackgerritMonty Taylor proposed opendev/system-config master: Remove run_all.sh and ansible cron job  https://review.opendev.org/71705818:46
openstackgerritMonty Taylor proposed opendev/system-config master: Remove ansible-cron role  https://review.opendev.org/71705918:46
openstackgerritMonty Taylor proposed opendev/system-config master: Trigger everything on inventory changes  https://review.opendev.org/71711418:46
openstackgerritMonty Taylor proposed opendev/system-config master: Update typo on infra-prod-service-letsencrypt  https://review.opendev.org/71818918:46
mordredclarkb, fungi : please to see https://review.opendev.org/71818918:47
mordredhttps://review.opendev.org/#/c/717053 <-- second to last zuul comment is what got me there18:48
mordredclarkb: also - I had to rebase a merge conflict out so could you re:+2 https://review.opendev.org/#/c/717054 ?18:52
openstackgerritMonty Taylor proposed opendev/system-config master: Remove leftover /var/run dir creation  https://review.opendev.org/71713218:53
clarkbmordred: done18:58
corvusmordred, clarkb: apache logs go to stdout, so docker logs gets them19:00
mordredcorvus: not the error logs19:01
mordredcorvus: at least - I found things by looking in /var/log/apache2 inside the container that docker logs did not show me19:01
corvusoh neat19:01
mordredyeah19:02
corvuswell, i could go either way on bind mounting them or also redirecting them to stdout19:02
corvusdoes something rotate them?19:02
mordredno - we'd need to add that19:02
corvusso right now, the error logs could fill up the container19:03
corvusthough it only has 2 lines in it, so it would happen very slowly19:04
mordredcorvus: yeah - eventually in the fullness of time19:04
mordredand I mean we should only error log when we throw a core19:04
corvusoh it's meeting time19:05
mordredcorvus: crap - I forgot - what's the new location?19:05
corvus#opendev-meeting19:05
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005719:07
ianw"For an example of OpenID Connect at work, look at Google+ Sign-In, Google’s flagship social-identity offering" ... that didn't age well19:39
fungiheh19:39
*** ralonsoh has quit IRC19:40
mordredhaha19:44
clarkbI had to delete an index with a corrupted translog from elasticsearch. It was just bouncing that shard around to all the nodes trying to recover it and failing19:54
clarkbonce I did that the cluster went green and now it is rebalancing onto es02 (which had crashed)19:54
openstackgerritOleksandr Kozachenko proposed openstack/project-config master: Add vexxhost/ansible-role-base-server  https://review.opendev.org/71819919:56
openstackgerritMerged opendev/system-config master: Run zuul and nodepool related deploys hourly  https://review.opendev.org/71706420:00
clarkbI need to find lunch now20:00
fungiyeah, firing up the wok now20:02
mordredmnaser: ^^20:04
mordredmnaser: we should see the zuul-preview job run within the next hour20:05
mnasermordred: cool!  i'm excited :D20:05
* mnaser feels like they're opening all the old boxes of stuff that was never quite polished20:05
* mordred appreciates the adding of the polish!20:09
openstackgerritMerged opendev/system-config master: Remove leftover /var/run dir creation  https://review.opendev.org/71713220:09
mordredfungi, ianw : not to re-adjudicate things we've already thought about several times ....20:19
mordredbut I just had a thought20:19
openstackgerritMerged opendev/system-config master: Update typo on infra-prod-service-letsencrypt  https://review.opendev.org/71818920:19
mordredthe new python-stow element does a nice job of providing a non-invasive mechanism for pre-installing python versions that is still behind a decent enough interface that we can check for it in zuul-jobs20:20
mordredwhat if we did similar for tox and pip -so that ensure-tox could check to see if there is a tox in stow and if so "install" it via stow20:21
mordred(I think this builds on top of what ianw has been talking about with the pip/virtualenv rework)20:21
mordredbecause we could do "python -m venv /usr/local/stow/tox ; /usr/local/stow/tox/bin/pip install tox" - then have ensure-tox look to see if can use stow to install that tox.20:22
mordredWAIT20:22
mordrednevermind20:22
mordredignore all of those words20:22
mordredthat won't work at all - our previous plans are still better20:22
* mordred goes back into hiding20:22
*** hashar has joined #opendev20:38
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766320:38
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765720:38
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-pip: export ensure_pip_virtualenv_command  https://review.opendev.org/71822420:39
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] use ensure-pip in fetch-subunit-output test  https://review.opendev.org/71822520:39
ianwmordred: if i had to distill my current thinking, which i reserve the right to change :) ...20:40
ianwzuul-jobs roles currently use pip: and frequently the virtualenv: field of pip: without consideration of pip, its dependencies, or virtualenv being installed20:41
ianwso, we add a ensure-pip role that installs pip, from packages, and exports a variable that works as a virtualenv_command (usually, "/usr/bin/python3 -m venv")20:42
ianwif "/usr/bin/python3 -m venv" doesn't exist on the host (and at this point, I can't think of an environment we support that it doesn't) it exports "virtualenv" and every role is in the same position it is in now; running with the hope/assumption that virtualenv is installed and working20:44
fungipresumably we need to recommend somewhere that people incorporate the python3-venv package into their debian derivative (ubuntu, et cetera) images, since it generally isn't installed by default20:46
clarkbthe outdoors are not quite as warm as I had hoped20:46
ianwfungi: i think that (and currently done in wip changes) the ensure-pip role can reasonably install that package as a pip dependency20:47
fungigot it20:48
ianwfungi: although, on our systems, it does not install as such, because it's already there on dib images20:48
fungigranted, people don't need system-context pip installed to have system-context venv20:48
ianwwhat i *don't* want to get this role into is the game of installing virtualenv ... because i feel this is ambiguous.  "virtualenv" is ambiguous between python2 and 320:49
fungigranted, if folks want venv and no system-wide pip, we can always break that role up further20:50
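The fallback ianw describes can be sketched roughly as follows. This is an illustration only, not the ensure-pip role itself (which is an Ansible role in zuul-jobs): the function name `pick_virtualenv_command` and the injectable `has_module` probe are hypothetical, and it inspects the interpreter running the script rather than `/usr/bin/python3` on a remote host. Checking for `ensurepip` alongside `venv` is an assumption meant to mirror the Debian-derivative case fungi mentions, where `python3-venv` is a separate package.

```python
import importlib.util
import sys


def pick_virtualenv_command(python=sys.executable, has_module=None):
    """Choose a command suitable for the pip module's virtualenv_command.

    Prefer "<python> -m venv" when the stdlib venv machinery is usable,
    otherwise fall back to plain "virtualenv" and hope it is installed,
    which is the position existing roles are already in.
    """
    if has_module is None:
        # Default probe: ask the running interpreter whether a module exists.
        def has_module(name):
            return importlib.util.find_spec(name) is not None

    if has_module("venv") and has_module("ensurepip"):
        return f"{python} -m venv"
    return "virtualenv"


if __name__ == "__main__":
    print(pick_virtualenv_command())
```

Passing the probe in as a parameter keeps the decision testable without depending on what happens to be installed on the machine running it.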
clarkb"Elsa never gets cold" I've been told to deal with it being chilly21:09
corvusclarkb: i left a little -1 on https://review.opendev.org/71816121:09
clarkbcorvus: k will update21:09
openstackgerritClark Boylan proposed opendev/system-config master: Improving logging of prod playbook jobs  https://review.opendev.org/71816121:10
clarkbcorvus: ^ fixed21:10
corvusclarkb: did you attempt to explain how you and elsa may be different?  or would that only further highlight your failings?21:10
clarkbcorvus: ya I think that would only further highlight my failings21:11
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure pip role  https://review.opendev.org/71763921:11
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-pip: export ensure_pip_virtualenv_command  https://review.opendev.org/71822421:11
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] fetch-zuul-cloner: use ensure-pip  https://review.opendev.org/71788221:11
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] use ensure-pip in fetch-subunit-output test  https://review.opendev.org/71822521:11
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766321:11
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765721:11
corvus"A-- not elsa"21:11
mordredcorvus: if I get the error: Exception: Job infra-prod-service-nodepool depends on infra-prod-update-system-config which was not run.21:26
mordredcorvus: because we don't have that in the opendev-prod-hourly pipeline - obviously I can (and should) add it to the pipeline21:27
mordredcorvus: however - there are three other soft depends - it's safe to not add them to the pipeline because they're soft? or do I need to change the dependencies in that pipeline invocation?21:27
corvusmordred: i think you need to change the deps21:28
mordredcorvus: ok.21:28
corvusi believe that error comes when it's constructing the job graph for the item, so it wants to see everything in the graph, even if the links are optional21:29
openstackgerritMonty Taylor proposed opendev/system-config master: Add sytem-config-update and remove other deps from hourly  https://review.opendev.org/71823021:30
mordredcorvus, clarkb : ^^ that should fix our hourly pipeline entries21:30
clarkbmordred: that is making me wonder if we should put all the deps into the pipelines?21:32
openstackgerritmelanie witt proposed openstack/project-config master: Update grafana for ceph to reflect current jobs  https://review.opendev.org/71823221:32
clarkbas variants I mean21:32
mordredclarkb: yeah- we've been waffling back and forth on that as to which thing is better21:40
mordredclarkb: I think I'm starting to waffle back to "let's put them all in the pipelines" - because the deps really do frequently depend on the pipeline it's in in this case21:41
mordredthat said - we'll lose the ability to get the common deps that are shared by the majority of these jobs21:41
mordredso there will be a lot more yaml anchor :)21:41
mordredclarkb: I'll make a patch that does that on the end of the stack21:42
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] fetch-zuul-cloner: use ensure-pip  https://review.opendev.org/71788221:45
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] use ensure-pip in fetch-subunit-output test  https://review.opendev.org/71822521:45
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: preinstall pip  https://review.opendev.org/71766321:45
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765721:45
openstackgerritMerged opendev/system-config master: Add IRC logs and ML subscribe links to opendev.org  https://review.opendev.org/71818821:49
corvusmordred: the etherpad.opendev.org patch merged; what's next on that?21:55
mordredcorvus: 2 things21:56
mordredcorvus: we need to run the playbook - because the job had an error in its dependencies21:56
mordredcorvus: (this is coming to be a running refrain)21:56
mordredbut I think we've fixed that now21:57
clarkbmordred: have you been doing that via zuul enqueue or running the playbook on bridge directly?21:57
mordredand then there is a db dump from production I was going to run in21:57
mordredclarkb: so far neither - when we fix things I've either waited for a periodic run or a new natural triggering event21:57
mordredin this case it's likely more expedient to run the playbook by hand21:57
clarkbrgr21:58
mordredso - shall I do that real quick?21:58
corvusmordred: ++21:58
mordredrunning21:58
mordredI have stopped running it21:59
mordredI'm going to send up a patch21:59
openstackgerritMonty Taylor proposed opendev/system-config master: Only run etherpad playbook on new server  https://review.opendev.org/71823522:00
mordredclarkb, corvus : ^^ that's important :)22:00
fungiokay, done making/eating dinner, back to pitch in22:00
corvusthen i think we can test it out -- then do we shut down prod etherpad, add a cname, and do the db dump/load?  maybe that should be scheduled?  or we could just do it real quick like some afternoon evening?22:00
fungijust in time it seems!22:00
clarkbcorvus: it sounded like fungi was saying scheduling it would be a good idea22:00
corvusmordred: comment on that22:00
corvuslooks like etherpad.openstack.org has a 3600 ttl; we should lower that to 30022:01
corvusthen we can take a 5 min outage to deal with the ttl and the db dump22:02
fungischeduling mainly so we can announce it, in case there are unexpected behavior changes with the new version. but maybe we can just announce it when we do it and tell folks to let us know if there are problems22:02
corvusi'll go lower the ttl in rax22:02
mordredcorvus: I'm not sure which syntax is correct there - let's just do it with etherpad01 for now - and switch it back to the etherpad group when it's done22:02
corvusmordred: k22:02
mordredI think that's likely to happen before we grow an 0222:02
mordredfwiw - nothing new was installed on etherpad01.openstack.org - I caught the ansible quickly enough22:02
mordredin positive news - that patch should run correctly in zuul22:03
mordredin case I'm AFK when that playbook runs and someone wants to run in the dump - it's in /opt/db on etherpad01.opendev.org22:04
mordredwe should be able to do cat /opt/db/etherpad.sql | docker-compose exec mariadb mysql22:05
mordredspeaking of - I'm now going to AFK for the evening quarantine walk22:06
openstackgerritMerged opendev/system-config master: Improving logging of prod playbook jobs  https://review.opendev.org/71816122:07
corvus#status log lowered etherpad.openstack.org dns ttl to 300 seconds22:07
openstackstatuscorvus: finished logging22:07
corvusmordred: +2 on the whole infra-prod-zuul stack22:12
clarkbcorvus: mordred why doesn't the puppet remote else job depend on update system-config?22:27
clarkbleft that as a question on the change (I think we do want that dep since a lot of our puppet is in system-config)22:29
fungii thought we already covered that22:29
fungireminding myself what the answer was...22:29
fungioh, i think it was because it will have already run first on its own but i'll need to revisit why that was22:31
clarkbfungi: thats the case across different changes but in this case you would update puppet for something and you'd need system-config to be updated on bridge for that specific change I think22:31
openstackgerritMerged opendev/system-config master: Add sytem-config-update and remove other deps from hourly  https://review.opendev.org/71823022:32
corvusclarkb: oh that's a good point... i think maybe you're right, so i removed my +222:33
clarkbit conflicts with that change that just merged anyway so new ps if necessary isn't a big deal :)22:33
openstackgerritIan Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox: use ensure-pip role  https://review.opendev.org/71766322:35
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765722:35
*** DSpider has quit IRC22:51
openstackgerritMerged opendev/system-config master: Only run etherpad playbook on new server  https://review.opendev.org/71823522:52
*** hashar has quit IRC22:56
ianwclarkb: do you remember what the inspiration for the tox env installed via element was -> https://review.opendev.org/#/c/713017/ ?23:08
clarkbianw: I think it was just to have common tools preinstalled without polluting the global namespace23:11
clarkbianw: so we can have tox there ready to go without breaking the system package manager23:11
ianwclarkb: is there a reason we can't do that in zuul-jobs?23:12
clarkbianw: I'd like to avoid hitting pypi that much23:12
clarkbwe don't necessarily need to, but if we can identify common things and take some of that load off of pypi I think it is a good thing23:13
clarkb(particularly after their plea to have more mirrors out there)23:13
ianwwe should be going via our cache, though?23:13
clarkbianw: the indexes have a really short ttl so we hit them (that is where we typically see our issues these days)23:13
*** mlavalle has quit IRC23:15
clarkbthe biggest downside is probably the cases where upstream releasing new versions that break us cause some disruption23:16
clarkband we need to spin out new images rather than update a flag on the jobs23:16
ianwok, gonna have to think on this ... thanks23:18
clarkbfwiw I can probably be convinced the other way around is better too. It is entirely possible my thinking is not sustainable :)23:19
clarkbfor us in this particular case I think its balancing job runtime and pypi requests against being able to quickly change the version of what is installed23:20
*** tosky has quit IRC23:26
fungii get the desire to be pragmatic and pre-cache certain bits of software on images which are heavily used by lots of the jobs we run, but i do wonder where the line is and how we decide when this tool or that tool should be preinstalled23:42
fungiespecially as opendev's horizons broaden and openstack (or python software in general) becomes less of a percentage of what we're testing23:43
fungiis there a point at which we drop tox, or at which we start adding compilers and interpreters for other languages?23:44
clarkbya I think tox is a bit special because its basically the version of make most of our users use (and I think we preinstall make as well)23:44
clarkbits definitely an optimization for a significant chunk of the jobs we run23:45
clarkbfungi: personally I've long wished we would just use make :)23:45
clarkbI even wrote a makefile to approximate tox things23:45
fungiyeah, tox seems like an overly-python-specific make without the flexibility or decades of stability behind it23:46
mnaserhmm23:59
mnaserhttp://site.925bfe37815144d0859f260605d5fb98.zuul.zuul-preview.opendev.org still 500s -- would someone be nice enough to check if the playbook was updated?23:59
