Tuesday, 2019-04-23

openstackgerritClark Boylan proposed opendev/system-config master: Double stack size on gitea  https://review.opendev.org/65463400:00
clarkbthere we go00:00
clarkbI'm gonna go track down dinner now00:00
clarkbbut will try to keep an eye on ^ as fixing that will be nice00:00
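The change above concerns the stack size git threads get when packing the very large openstack/openstack repo (the segfaults discussed later in the day). A purely illustrative sketch of the knob involved, not the actual fix in 654634:

    # inspect the current soft and hard stack limits (values are in KB)
    ulimit -Ss; ulimit -Hs
    # raise the soft limit for this shell and its children; 16 MB here is an assumed example value
    ulimit -Ss 16384
    # a repack of a huge repository started from this shell then runs with the larger stacks
    git -C /path/to/openstack/openstack.git repack -a -d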
*** ijw has quit IRC00:17
*** mattw4 has quit IRC00:23
*** michael-beaver has quit IRC00:23
*** gyee has quit IRC00:29
openstackgerritMerged opendev/system-config master: Use swift to back intermediate docker registry  https://review.opendev.org/65361300:30
*** mriedem has quit IRC00:30
*** dave-mccowan has joined #openstack-infra00:35
*** Weifan has quit IRC00:42
*** markvoelker has joined #openstack-infra00:51
*** jamesmcarthur has quit IRC00:54
*** smarcet has joined #openstack-infra01:00
*** whoami-rajat has joined #openstack-infra01:01
*** ricolin has joined #openstack-infra01:15
*** diablo_rojo has quit IRC01:16
*** smarcet has quit IRC01:24
*** smarcet has joined #openstack-infra01:28
mordredclarkb, corvus: sorry - was AFK way more today than I originally expected - had to deal with a bunch of family stuff - I think I'm caught up on openstack/openstack, stack sizes - and intermediate registries from scrollback - nice work on all of that01:36
*** rlandy|ruck has quit IRC01:46
*** hwoarang has quit IRC01:55
*** hwoarang has joined #openstack-infra02:00
clarkbmordred: isnt that a fun git bug?02:02
*** nicolasbock has quit IRC02:04
*** _erlon_ has quit IRC02:05
*** ykarel|away has joined #openstack-infra02:26
*** anteaya has joined #openstack-infra02:34
*** Weifan has joined #openstack-infra02:44
*** dklyle has quit IRC02:45
*** dklyle has joined #openstack-infra02:46
*** Weifan has quit IRC02:48
*** dave-mccowan has quit IRC02:48
*** bhavikdbavishi has joined #openstack-infra03:04
*** ykarel|away is now known as ykarel03:06
*** hongbin has joined #openstack-infra03:10
*** bhavikdbavishi has quit IRC03:10
*** Qiming has quit IRC03:17
*** hwoarang has quit IRC03:18
*** rh-jelabarre has quit IRC03:19
*** hwoarang has joined #openstack-infra03:24
*** bhavikdbavishi has joined #openstack-infra03:28
*** zhangfei has joined #openstack-infra03:39
*** zhangfei has quit IRC03:40
*** zhangfei has joined #openstack-infra03:41
*** lpetrut has joined #openstack-infra03:50
*** yamamoto has quit IRC04:11
*** yamamoto has joined #openstack-infra04:12
*** lpetrut has quit IRC04:13
*** ramishra has joined #openstack-infra04:15
*** yamamoto has quit IRC04:19
*** udesale has joined #openstack-infra04:22
*** yamamoto has joined #openstack-infra04:33
*** hongbin has quit IRC04:33
*** zhangfei has quit IRC04:43
*** markvoelker has quit IRC04:57
*** Weifan has joined #openstack-infra05:00
*** Weifan has quit IRC05:00
*** ykarel is now known as ykarel|afk05:01
*** ykarel|afk has quit IRC05:06
*** raukadah is now known as chandankumar05:11
*** jaosorior has joined #openstack-infra05:17
*** yamamoto has quit IRC05:21
*** zhurong has joined #openstack-infra05:23
*** ykarel|afk has joined #openstack-infra05:23
*** yamamoto has joined #openstack-infra05:23
*** ykarel|afk is now known as ykarel05:24
*** yamamoto has quit IRC05:26
*** yamamoto has joined #openstack-infra05:27
*** yamamoto has quit IRC05:27
*** ykarel_ has joined #openstack-infra05:28
*** ykarel has quit IRC05:31
*** armax has quit IRC05:32
*** kjackal has joined #openstack-infra05:33
*** ccamacho has quit IRC05:46
dangtrinhntHi Infra Team. Right now the default topic of the #openstack-fenix channel is a little weird. I would like to change that but it looks like I don't have enough privileges to do that. If someone can help, it would be great. Many thanks.05:47
*** quiquell|off is now known as quiquell|rover05:49
*** yamamoto has joined #openstack-infra05:51
*** kjackal has quit IRC05:56
*** lpetrut has joined #openstack-infra06:00
*** pcaruana has joined #openstack-infra06:06
AJaegerconfig-core, here's a change to use py36 for some periodic jobs - please put on your review queue: https://review.opendev.org/65457106:08
*** electrofelix has joined #openstack-infra06:14
iceyI think I'm missing a project after the openstack->opendev migration? when I try to `git review ...` I get "fatal: Project not found: openstack/charm-vault ... fatal: Could not read from remote repository." I'm guessing it's because it somehow moved into a namespace "x" on opendev.org (https://opendev.org/x/charm-vault)06:15
quiquell|roverhello, what's the replacement for https://git.openstack.org/cgit/... with opendev ?06:17
*** kjackal has joined #openstack-infra06:18
*** dpawlik has joined #openstack-infra06:18
iceyquiquell|rover: opendev.org seems to be06:21
*** slaweq has joined #openstack-infra06:21
*** yamamoto has quit IRC06:22
quiquell|roversshnaidm|afk: ^06:23
quiquell|roversshnaidm|afk: fixed reproducer with latest comments https://review.rdoproject.org/r/2037106:23
*** yamamoto has joined #openstack-infra06:25
*** yamamoto has quit IRC06:25
AJaegerquiquell|rover: the old https URLs should redirect06:26
*** yamamoto has joined #openstack-infra06:26
AJaegericey: yes, see all the emails on openstack-infra, openstack-discuss about OpenDev06:26
iceyAJaeger: I've seen the emails, I'm wondering why most of the openstack-charms stayed under openstack, and charm-vault moved :-/06:27
AJaegericey: you need to update your ssh remotes, we cannot redirect those.06:27
AJaegericey: charm-vault is not an official OpenStack project06:27
iceyAJaeger: interesting :-/06:27
AJaegericey: not listed here: https://governance.openstack.org/tc/reference/projects/openstack-charms.html06:28
iceyAJaeger: indeed - I suspect that's an oversight; annoying but thanks :)06:28
*** yamamoto has quit IRC06:29
*** yamamoto has joined #openstack-infra06:29
*** yamamoto has quit IRC06:29
AJaegericey: it's not an oversight, see https://review.opendev.org/#/c/541287/ - the PTL rejected making it part of the official charms06:30
AJaegerbbl06:30
*** ykarel_ is now known as ykarel06:31
iceyI see, thanks again AJaeger06:31
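For anyone else hitting the same "Project not found" after the rename, the ssh remote has to be repointed by hand; a minimal sketch, with the username placeholder and the x/ path taken from icey's example (the repo's .gitreview may need the same path update):

    # repoint the existing gerrit remote at the new canonical project location
    git remote set-url gerrit ssh://<username>@review.opendev.org:29418/x/charm-vault.git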
ykarelLooks like the OpenStack Release Bot is sending wrong updates to .gitreview06:32
ykarelwithout rebasing06:32
ykarelsee some last updates:- https://review.opendev.org/#/q/owner:OpenStack+Release+Bot+gitreview06:33
ykarelinfra-root ^^ AJaeger ^^06:33
*** udesale has quit IRC06:35
*** dciabrin has joined #openstack-infra06:35
*** udesale has joined #openstack-infra06:37
ykarelOkk seems those are old reviews posted before migration06:38
ykarelbut merging those as-is without a rebase will overwrite .gitreview06:39
*** bhavikdbavishi has quit IRC06:39
*** bhavikdbavishi has joined #openstack-infra06:41
*** quiquell|rover is now known as quique|rover|brb06:42
*** udesale has quit IRC06:44
*** udesale has joined #openstack-infra06:44
*** zhangfei has joined #openstack-infra06:51
*** markvoelker has joined #openstack-infra06:52
*** pgaxatte has joined #openstack-infra06:58
AJaegerykarel: best talk with release team...06:58
ykarelAJaeger, ack, seems they are in some other timezone, already posted an issue this morning06:59
AJaegerykarel: patching those is fine as well ;)06:59
ykarels/a/other06:59
*** yamamoto has joined #openstack-infra07:03
*** ginopc has joined #openstack-infra07:08
*** yamamoto has quit IRC07:11
*** quique|rover|brb is now known as quiquell|rover07:12
*** arxcruz|off|23 is now known as arxcruz07:13
*** ccamacho has joined #openstack-infra07:13
*** iurygregory has joined #openstack-infra07:14
*** zhangfei has quit IRC07:15
*** zhangfei has joined #openstack-infra07:15
*** tosky has joined #openstack-infra07:17
*** udesale has quit IRC07:18
*** udesale has joined #openstack-infra07:19
*** ccamacho has quit IRC07:27
*** ccamacho has joined #openstack-infra07:27
*** yamamoto has joined #openstack-infra07:31
*** yamamoto has quit IRC07:31
*** fmount has quit IRC07:32
*** yamamoto has joined #openstack-infra07:33
openstackgerritJason Lee proposed opendev/storyboard master: WIP: Updated Loader functionality in preparation for Writer  https://review.opendev.org/65481207:33
*** fmount has joined #openstack-infra07:35
*** gfidente has joined #openstack-infra07:40
toskyAJaeger: uhm, huge bunch of wrong fixes for the opendev transition07:40
*** jpena|off has joined #openstack-infra07:43
*** jpena|off is now known as jpena07:43
*** ykarel is now known as ykarel|lunch07:45
openstackgerritBernard Cafarelli proposed openstack/project-config master: Update Grafana dashboards for stable Neutron releases  https://review.opendev.org/65335407:52
*** dtantsur|afk is now known as dtantsur07:55
*** jpich has joined #openstack-infra07:55
*** kjackal has quit IRC07:57
*** kjackal has joined #openstack-infra07:58
*** roman_g has joined #openstack-infra08:01
*** rpittau|afk is now known as rpittau08:08
*** helenafm has joined #openstack-infra08:08
*** gfidente has quit IRC08:12
*** lseki has joined #openstack-infra08:12
*** lucasagomes has joined #openstack-infra08:15
*** dikonoor has joined #openstack-infra08:28
*** derekh has joined #openstack-infra08:28
*** apetrich has joined #openstack-infra08:30
fricklerinfra-root: do we already have a plan to make opendev.org listen on IPv6? seems the lack of that is actively breaking things, see e.g. this paste posted in #-qa http://paste.openstack.org/show/749620/ . we might want to drop the AAAA record until we get that fixed08:31
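A quick way to reproduce what that paste shows and to separate DNS from connectivity, as a sketch using the hostname from frickler's report:

    # is an AAAA record published for the name?
    dig +short AAAA opendev.org
    # force IPv4 and then IPv6 and compare the responses
    curl -4 -sI https://opendev.org/ | head -1
    curl -6 -sI https://opendev.org/ | head -1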
*** ginopc has quit IRC08:33
*** rossella_s has joined #openstack-infra08:36
*** ginopc has joined #openstack-infra08:39
*** e0ne has joined #openstack-infra08:39
*** tkajinam has quit IRC08:42
*** mleroy has joined #openstack-infra08:52
*** ykarel|lunch is now known as ykarel08:52
*** dikonoor has quit IRC08:56
*** jbadiapa has joined #openstack-infra09:02
*** ginopc has quit IRC09:06
*** dikonoor has joined #openstack-infra09:09
*** gfidente has joined #openstack-infra09:20
*** kjackal has quit IRC09:24
*** kjackal has joined #openstack-infra09:24
*** lpetrut has quit IRC09:30
AJaegertosky: yeah, I handed out a few -1s ;(09:30
*** jaosorior has quit IRC09:31
*** amansi26 has joined #openstack-infra09:35
*** yamamoto has quit IRC09:40
*** Lucas_Gray has joined #openstack-infra09:42
*** gfidente has quit IRC09:46
*** yamamoto has joined #openstack-infra09:48
*** yamamoto has quit IRC09:53
*** jcoufal has joined #openstack-infra09:54
*** bhavikdbavishi has quit IRC09:59
*** gfidente has joined #openstack-infra10:00
*** kjackal has quit IRC10:01
*** jaosorior has joined #openstack-infra10:14
*** threestrands has quit IRC10:15
*** lseki has quit IRC10:16
*** kjackal has joined #openstack-infra10:16
*** gfidente has quit IRC10:32
*** sshnaidm|afk is now known as sshnaidm10:40
*** bhavikdbavishi has joined #openstack-infra10:44
*** amansi26 has quit IRC10:46
*** ginopc has joined #openstack-infra10:47
*** bhavikdbavishi has quit IRC10:56
*** nicolasbock has joined #openstack-infra11:00
*** jaosorior has quit IRC11:00
*** dikonoor has quit IRC11:03
aspiersgit.openstack.org[0: 23.253.125.17]: errno=No route to host11:08
aspiersgit.openstack.org[1: 2001:4800:7817:103:be76:4eff:fe04:e3e3]: errno=Network is unreachable11:08
aspiersinfra-root: is this expected post-transition?11:08
openstackgerritGhanshyam Mann proposed openstack/openstack-zuul-jobs master: Add python36-charm-jobs project template  https://review.opendev.org/65495411:09
*** happyhemant has joined #openstack-infra11:09
* aspiers reads through mail threads to see if he missed something11:09
*** yamamoto has joined #openstack-infra11:12
*** ykarel is now known as ykarel|afk11:13
frickleraspiers: no, those are expected to work and do work fine for me, maybe a local networking issue for you? we do have a known issue with opendev.org not responding via IPv6, though11:14
aspiersfrickler: I just saw the same issue reported above in the scrollback11:14
aspiersnope, this is not an IPv6 issue - see the above paste which is both IPv4 and v611:15
aspiersfrickler: http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2019-04-22.log.html11:16
aspiersclarkb: in case you didn't know, there's also git remote set-url these days; no need to remove and re-add11:17
frickleraspiers: yeah, I haven't read much scrollback yet. so you did have some git:// remote?11:17
aspiersfrickler: yes11:18
frickleraspiers: ah, o.k., that's really some weird way of reporting errors11:19
aspiersclarkb: although I'm not sure if that actually has much benefit, since I guess a remote remove won't immediately GC all the objects from that remote forcing a redownload11:19
aspiersfrickler: sorry, which way is weird?11:19
*** yamamoto has quit IRC11:19
frickleraspiers: oh, sorry, that could be misunderstood. I was talking about git, not about you11:21
aspiers:)11:21
aspiersfrickler: you mean No route to host?11:21
frickleraspiers: yes, that and network unreachable. they really should be "connection refused" instead11:21
aspiersfrickler: actually that error message comes straight from the OS via strerror(3), so I think it has to be correct11:24
*** bhavikdbavishi has joined #openstack-infra11:25
aspiersalthough I can ping 23.253.125.17 so it does seem weird11:25
aspiersthere must be something else strange going on11:25
*** bhavikdbavishi has quit IRC11:25
frickleraspiers: hmm, via tcpdump I see "ICMP host 23.253.125.17 unreachable - admin prohibited", that doesn't match to "No route to host" to me11:25
frickleraspiers: but yeah, maybe a kernel thing instead of git11:26
aspiersfrickler: in any case, a) you are saying it's supposed to work? and b) what's the correct future-proof git:// host to use?11:26
*** smarcet has quit IRC11:26
*** bhavikdbavishi has joined #openstack-infra11:26
frickleraspiers: no, the git:// variant is no longer working, since our new frontend doesn't support it. changing to http(s) is the correct way to fix this issue11:27
frickleraspiers: I was only confused by the error message11:28
aspiersah11:28
aspierswas it announced anywhere that git:// no longer works? if so I missed it11:28
aspiersif not, I fear you can expect a lot more questions about this11:28
fricklerit should have been in one of the mails early on11:28
*** rh-jelabarre has joined #openstack-infra11:29
fricklerthere was also a set of patches removing the git:// references from devstack and elsewhere11:29
fricklerlet me check the archives11:29
*** bhavikdbavishi1 has joined #openstack-infra11:30
aspiersit wasn't in an announcement, but it was buried in 3 followups within that large thread11:30
aspiershttp://lists.openstack.org/pipermail/openstack-discuss/2019-April/004921.html11:30
*** bhavikdbavishi has quit IRC11:31
*** bhavikdbavishi1 is now known as bhavikdbavishi11:31
*** apetrich has quit IRC11:32
*** zhangfei has quit IRC11:32
*** jpena is now known as jpena|lunch11:33
frickleraspiers: yeah, seems like this was a bit hidden, sorry for that11:33
aspiersnp :) just want to help you avoid a flood of duplicate questions11:34
aspiersfrickler: here's a nice workaround to advertise:11:35
aspiersgit config --global url.https://git.openstack.org/.insteadof git://git.openstack.org/11:35
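Spelled out a little more readably, together with the per-repo `git remote set-url` alternative aspiers mentioned earlier (the repository in the second line is only an example):

    # rewrite any git:// git.openstack.org URL to https:// for every repo on this machine
    git config --global url."https://git.openstack.org/".insteadOf "git://git.openstack.org/"
    # or fix a single existing clone without removing and re-adding the remote
    git remote set-url origin https://opendev.org/openstack/nova-specs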
AJaegerfrickler: could you review https://review.opendev.org/#/c/654954/ and https://review.opendev.org/654571 , please?11:35
*** quiquell|rover is now known as quique|rover|lun11:35
*** quique|rover|lun is now known as quique|rover|eat11:36
*** apetrich has joined #openstack-infra11:36
AJaegerargh, just gave -1 on 954...11:37
*** bhavikdbavishi has quit IRC11:42
*** bhavikdbavishi has joined #openstack-infra11:43
*** lyarwood has joined #openstack-infra11:44
aspiersOK this is like a million times nicer https://opendev.org/openstack/nova-specs11:48
aspierskudos corvus clarkb fungi and all of infra-root!11:49
openstackgerritMerged openstack/openstack-zuul-jobs master: Use py36 instead of py35 for periodic master jobs  https://review.opendev.org/65457111:50
openstackgerritGhanshyam Mann proposed openstack/openstack-zuul-jobs master: Add python36-charm-jobs project template  https://review.opendev.org/65495411:55
*** yamamoto has joined #openstack-infra11:57
*** panda is now known as panda|lunch11:57
*** quique|rover|eat is now known as quiquell|rover11:59
*** markvoelker has quit IRC12:01
*** rlandy has joined #openstack-infra12:06
*** boden has joined #openstack-infra12:07
*** rlandy is now known as rlandy|ruck12:07
fricklerinfra-root: the hashdiff-0.3.9 gem breaks beaker-trusty, previous passing job had hashdiff-0.3.8 http://logs.openstack.org/77/654577/1/gate/openstackci-beaker-ubuntu-trusty/dfa87dd/job-output.txt.gz#_2019-04-22_20_56_13_41969912:09
AJaegerfrickler: ah. Can we use that version for now?12:10
*** mriedem has joined #openstack-infra12:11
fricklerAJaeger: not sure, maybe someone with more puppet voodoo than myself can find a fix. otherwise I'd propose to make that job non-voting for now12:11
bodenhi. I'm trying to understand if/when we might expect "Hound" (code search) to work again? I sent a note to the ML http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005481.html. but never saw a response about when it might be available12:11
*** jbadiapa has quit IRC12:22
*** gfidente has joined #openstack-infra12:28
*** jpena|lunch is now known as jpena12:31
openstackgerritSlawek Kaplonski proposed openstack/project-config master: Add openstacksdk-functional-devstack-networking job to Neutron dashboard  https://review.opendev.org/65299312:31
*** gfidente has quit IRC12:34
AJaegerconfig-core, could you review https://review.opendev.org/654574 as next step for py36 jobs, please? thanks!12:39
AJaegerboden: idea is to use https://opendev.org/explore/code instead of codesearch12:40
*** gfidente has joined #openstack-infra12:40
*** kaiokmo has joined #openstack-infra12:40
openstackgerritMerged openstack/openstack-zuul-jobs master: Add python36-charm-jobs project template  https://review.opendev.org/65495412:40
bodenAJaeger as of right now there doesn't seem to be parity... search results that I need return nothing with https://opendev.org/explore/code12:41
openstackgerritTobias Henkel proposed openstack/diskimage-builder master: Support defining the free space in the image  https://review.opendev.org/65512712:41
bodenAJaeger I don't see code explorer working for our needs as-is, unless I'm missing something12:41
*** kgiusti has joined #openstack-infra12:41
*** bhavikdbavishi has quit IRC12:42
bodenAJaeger for example I want to find all the requirements.txt files that have the string "neutron-lib-current", but it seems to not find anything https://opendev.org/explore/code?q=neutron-lib-current&tab=12:42
fricklerboden: you need to quote the "-" as "\-"12:42
bodenfrickler unless I'm missing something, it doesn't help12:44
bodenthere are at least 20 projects that have the string "neutron-lib-current" in their requirements.txt file...12:44
*** kjackal has quit IRC12:44
*** kjackal has joined #openstack-infra12:44
*** jamesmcarthur has joined #openstack-infra12:46
*** lseki has joined #openstack-infra12:46
fricklerboden: do you have an example? I can't seem to find one easily12:48
AJaegerfrickler: dragonflow/requirements.txt12:48
bodenfrickler https://github.com/openstack/networking-ovn/blob/master/requirements.txt#L2412:48
bodenfrickler, or as another example I want to find all uses of the (neutron) constant "ROUTER_CONTROLLER"... I can't find any, and there are some for sure12:49
*** xarses_ has joined #openstack-infra12:49
*** rh-jelabarre has quit IRC12:50
*** ykarel|afk is now known as ykarel12:51
*** aaronsheffield has joined #openstack-infra12:51
*** xarses has quit IRC12:52
fricklerboden: hmm, that's indeed strange. searching for some other term like "policy\-in\-code" seems to work fine. maybe https://opendev.org/explore/code?q=infra+initiatives&tab= can help you as a workaround for the first search. but there certainly seems to be an issue with terms containing "_"12:55
AJaegerboden: for your specific query: in openstack namespace it's neutron, neutron-lib, networking-odl. I don't have the namespace x checked out to grep there12:57
*** andreww has joined #openstack-infra12:58
*** xarses_ has quit IRC13:01
*** jpich has quit IRC13:01
*** gfidente has quit IRC13:01
*** smarcet has joined #openstack-infra13:01
*** jpich has joined #openstack-infra13:02
openstackgerritMonty Taylor proposed zuul/nodepool master: Update devstack settings and docs for opendev  https://review.opendev.org/65423013:03
mordredfrickler: I pushed up https://review.opendev.org/655133 which I *think* should fix that ^^13:03
mordredboden: I was out yesterday so I'm not 100% caught up on hound - I believe clarkb was looking at it yesterday though13:04
*** eharney has quit IRC13:04
AJaegermordred: could you review a small py35->36 change, please? https://review.opendev.org/#/c/654574/13:05
mordredwe'd definitely LIKE to replace it with the gitea codesearch - but I think more work might need to go in to that to make it suitable (there is now upstream support for pluggable search backends and we'd like to get an elasticsearch backend put in there, for instance)13:05
AJaegerthanks, mordred13:07
bodenfrickler I'm not sure we can count on "infra initiatives" being there... AJaeger we also have projects in the x/ namespace that we need to search13:09
bodenjust as a heads up this will certainly impact some of our work on neutron blueprints since we need the ability to search cross projects for impacts13:10
openstackgerritMerged openstack/project-config master: Use py36 instead of py35 for periodic master jobs  https://review.opendev.org/65457413:13
clarkbmordred: boden I've not had a chance to look at hound yet. I think it updates based on projects.yaml but needs a restart? unsure. I'm likely to follow up on git stack size segfaults for openstack/openstack first today, then can start looking at hound13:14
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-neutron-lib-master  https://review.opendev.org/65458013:14
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-ovsdbapp-master  https://review.opendev.org/65513613:14
AJaegerclarkb: for the git stack fix, we need to fix the system-config tests first; see frickler's comment above on hashdiff-0.3.9 breaking beaker-trusty.13:14
AJaegerclarkb: and good morning!13:15
*** jbadiapa has joined #openstack-infra13:15
clarkbAJaeger: got it and thanks. I'm not quite "here" yet13:15
clarkbthe openstack infra ci helper has our package pins iirc13:16
*** jamesmcarthur has quit IRC13:18
openstackgerritMonty Taylor proposed opendev/system-config master: Set logo height rather than width  https://review.opendev.org/65513913:19
openstackgerritDarragh Bailey (electrofelix) proposed zuul/zuul master: Improve proxy settings support for compose env  https://review.opendev.org/65514013:20
openstackgerritDarragh Bailey (electrofelix) proposed zuul/zuul master: Add some packages for basic python jobs  https://review.opendev.org/65514113:20
openstackgerritDarragh Bailey (electrofelix) proposed zuul/zuul master: Scale nodes up to 4 instances  https://review.opendev.org/65514213:20
mriedemmnaser: i see that http://status.openstack.org/elastic-recheck/#1806912 hits predominantly in vexxhost-sjc1 nodes, any idea if there could be something slowing down g-api startup there?13:21
bodenclarkb okay... should I create a bug or something to track this work, or no?13:21
mordredclarkb: the config.json definitely looks updated on codesearch - want me to just restart the service?13:21
*** lpetrut has joined #openstack-infra13:21
clarkbmordred: probably worth a try13:22
clarkbboden: I'm not sure it is needed. It is a known thing, just lower on the priority list because worst case local grep works13:22
mordredoh - nope. there's another issue13:22
openstackgerritAndreas Jaeger proposed opendev/puppet-openstack_infra_spec_helper master: Block hashdiff 0.3.9  https://review.opendev.org/65514313:23
AJaegerclarkb: is that the proper fix for hashdiff? ^13:23
*** panda|lunch is now known as panda13:23
clarkbAJaeger: I think so13:24
*** redrobot has joined #openstack-infra13:24
openstackgerritMonty Taylor proposed opendev/jeepyb master: Use opendev and https by default  https://review.opendev.org/65514513:24
mnasermriedem: odd.  has it increased recently?  We added IPv6 one or two weeks ago13:24
mordredclarkb, frickler: ^^ that is needed to fix codesearch13:24
mriedemmnaser: what's really weird is the g-api logs show it only taking about 7 seconds for g-api to startup13:25
mriedemhttp://logs.openstack.org/67/648867/8/check/openstacksdk-functional-devstack/e155792/controller/logs/screen-g-api.txt.gz13:25
*** jrist- is now known as jrist13:25
mriedemhttp://logs.openstack.org/67/648867/8/check/openstacksdk-functional-devstack/e155792/controller/logs/devstacklog.txt.gz#_2019-04-23_05_17_22_81113:25
*** sshnaidm is now known as sshnaidm|afk13:26
*** quiquell|rover is now known as quique|rover|lun13:26
*** quique|rover|lun is now known as quique|rover|eat13:26
mriedemlooks like devstack is uploading an image and then waiting to get the image back and maybe it's taking longer in swift?13:27
openstackgerritMonty Taylor proposed opendev/jeepyb master: Use opendev and https by default  https://review.opendev.org/65514513:27
bodenclarkb local grep isn't really an option for this work; we are talking 10s of projects that would need to be up to date to grep across them and whats more we can't share that search with people as its used in the code reviews13:27
mordredboden: yeah. we're working on getting codesearch fixed13:28
mriedemrosmaita: ^ on g-api taking over a minute to 'start' in case you can identify something in those logs13:28
mnasermriedem: is it possible that it is trying to check if its listening on ipv4/ipv6 and the check is verifying the other port?13:28
mriedemrosmaita: http://status.openstack.org/elastic-recheck/#180691213:28
openstackgerritMonty Taylor proposed opendev/jeepyb master: Use opendev and https by default  https://review.opendev.org/65514513:29
rosmaitamriedem: ack13:29
mriedemmnaser: not sure13:29
mnaserthen again we're not the only dual stack cloud13:30
mnaserI think13:30
*** rlandy|ruck is now known as rlandy|ruck|mtg13:30
mriedemthe bug hits other providers13:30
mriedemjust not as much13:30
mriedemi see a GET from g-api to swift here http://logs.openstack.org/67/648867/8/check/openstacksdk-functional-devstack/e155792/controller/logs/screen-g-api.txt.gz#_Apr_23_04_58_17_71376013:30
mriedemthat results in a 40413:31
mriedemwhich is maybe normal devstack checking to see if the image exists (or glance checking) before uploading it to swift?13:31
mordredboden, clarkb: I have restarted hound - it's going to take a couple of minutes because it's got a bunch of new stuff to clone and index13:31
mriedemguessing it's glance checking because then13:31
mriedemApr 23 04:58:17.714742 ubuntu-bionic-vexxhost-sjc1-0005468291 devstack@g-api.service[23037]: INFO glance_store._drivers.swift.store [None req-8c0ac321-01c3-40ac-a9f5-0b733baac629 admin admin] Creating swift container glance13:31
mnaseryeah I saw that too, that seems business as usual13:32
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-neutron-lib-master  https://review.opendev.org/65458013:32
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-ovsdbapp-master  https://review.opendev.org/65513613:32
fricklermriedem: apache proxies to 127.0.0.1:60998, but g-api is binding to :60999, not sure where these port numbers come from in the first place http://logs.openstack.org/67/648867/8/check/openstacksdk-functional-devstack/e155792/controller/logs/apache_config/glance-wsgi-api_conf.txt.gz13:33
fricklerhttp://logs.openstack.org/67/648867/8/check/openstacksdk-functional-devstack/e155792/controller/logs/screen-g-api.txt.gz#_Apr_23_04_58_11_63530813:33
*** sthussey has joined #openstack-infra13:33
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-neutron-lib-master  https://review.opendev.org/65458013:35
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-ovsdbapp-master  https://review.opendev.org/65513613:35
fricklerfor a working run I see 60999 in both locations13:35
*** bhavikdbavishi has joined #openstack-infra13:35
mriedemhmm yeah and the curl is specifying noproxy13:36
mriedemoh i guess that's just part of wait_for_service in devstack13:36
mordredclarkb: if you get a sec, https://review.opendev.org/#/c/655133/ should fix the nodepool devstack job I think13:37
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-ovsdbapp-master  https://review.opendev.org/65513613:39
*** bhavikdbavishi has quit IRC13:39
mriedemfrickler: it looks like the proxy port is random13:40
mriedemhttps://github.com/openstack/devstack/blob/master/lib/apache#L34913:41
mriedemport=$(get_random_port)13:41
fricklermriedem: yeah, so the glance config here also has 60998, not sure why uwsgi chooses 60999 instead http://logs.openstack.org/67/648867/8/check/openstacksdk-functional-devstack/e155792/controller/logs/etc/glance/glance-uwsgi.ini.gz13:42
*** liuyulong has joined #openstack-infra13:43
mnaserquestion: if we have a change that depends-on a review.openstack.org change, will the dependency *not* be taken into consideration or will it 'block' ?13:44
mnaserokay, never mind, it will just ignore it13:44
mnaserZuul took two minutes to pick up a change and bring it to gate so I was starting to wonder what was going on13:45
*** michael-beaver has joined #openstack-infra13:45
*** jamesmcarthur has joined #openstack-infra13:46
*** kranthikirang has joined #openstack-infra13:47
*** quique|rover|eat is now known as quiquell|rover13:47
*** jamesmcarthur has quit IRC13:47
*** smarcet has quit IRC13:48
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-neutron-lib-master  https://review.opendev.org/65458013:49
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openstack-tox-py35-with-ovsdbapp-master  https://review.opendev.org/65513613:49
pabelangermnaser: yah, depends-on: review.o.o will just be ignored13:50
mriedemfrickler: here is where the random port is retrieved (note ipv4 only) https://github.com/openstack/devstack/blob/master/functions#L80113:51
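A minimal way to spot-check that mismatch on the node itself, assuming the config paths visible in the logs above (the two port numbers are the ones from this particular run):

    # the port Apache proxies glance requests to
    grep -ri glance /etc/apache2/sites-enabled/ | grep 60998
    # the port uwsgi was told to use
    grep -E '6099[89]' /etc/glance/glance-uwsgi.ini
    # the port something is actually listening on
    ss -tlnp | grep -E ':6099[89]'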
*** jamesmcarthur has joined #openstack-infra13:52
*** smarcet has joined #openstack-infra13:52
openstackgerritNicolas Hicher proposed openstack/diskimage-builder master: openssh-server: enforce sshd config  https://review.opendev.org/65389013:53
AJaegeramorin: is ovh-bhs1 ok again? We disabled it on the 19th due to network problems, can we use it again?13:54
*** sshnaidm|afk is now known as sshnaidm13:55
*** yamamoto has quit IRC13:56
*** Goneri has joined #openstack-infra13:57
*** psachin has joined #openstack-infra13:58
*** rh-jelabarre has joined #openstack-infra13:59
*** amansi26 has joined #openstack-infra13:59
mnaserinfra-root: either limestone is maybe having issues or rax executors are having network issues, we're seeing a lot of RETRY_LIMIT, jobs failing midway, someone caught one http://paste.openstack.org/show/749635/14:00
*** jaosorior has joined #openstack-infra14:00
mnaserhttp://zuul.openstack.org/builds?result=RETRY_LIMIT14:01
mnaserwe have a lot of RETRY_LIMIT fails14:01
mnaserno logs however..14:01
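The dashboard page above is backed by Zuul's REST API, which makes it a bit easier to pull the failing builds into a script; a sketch, with the endpoint layout assumed from the web UI URL:

    # same listing as the /builds page, as JSON, for further filtering by job or node
    curl -s 'http://zuul.openstack.org/api/builds?result=RETRY_LIMIT&limit=100'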
*** Lucas_Gray has quit IRC14:04
*** mleroy has left #openstack-infra14:06
Shrewspabelanger: any idea what happened here? http://logs.openstack.org/62/654462/3/gate/openstackci-beaker-ubuntu-trusty/e77f31d/job-output.txt.gz#_2019-04-23_13_21_28_86653214:06
*** Lucas_Gray has joined #openstack-infra14:06
mordredboden: http://codesearch.openstack.org/?q=neutron-lib-current&i=nope&files=&repos= works now14:07
pabelangerShrews: I think AJaeger just posted a patch for that14:07
pabelangerhttps://review.opendev.org/655143/ maybe?14:07
Shrewsah14:07
pabelangerAJaeger: looks like we might need to cap bundler too14:08
*** smarcet has quit IRC14:08
fricklermriedem: humm, devstack is being run twice it seems. see the successful end of the first run here http://logs.openstack.org/67/648867/8/check/openstacksdk-functional-devstack/e155792/job-output.txt.gz#_2019-04-23_05_00_55_349794 and then the log of the second run in  http://logs.openstack.org/67/648867/8/check/openstacksdk-functional-devstack/e155792/controller/logs/devstacklog.txt.gz14:10
*** smarcet has joined #openstack-infra14:12
mriedemfrickler: i think clarkb pointed that out the last time i looked at this :)14:16
mordredmriedem, frickler: that doesn't seem like a thing we want it doing14:17
quiquell|roverpabelanger: ping14:17
AJaegerpabelanger: yeah - want to take over my change?14:17
quiquell|roverpabelanger: do we have any way to get zuul jobs enqueue time from all the stein cycle ?14:17
quiquell|roverpabelanger: like the enqueued_time json value at zuul status api14:18
corvusinfra-root: am i to understand that there are reports gitea is not answering on ipv6?  is anyone working on that?14:19
mordredcorvus: I believe that wound up being attempted use of git://14:20
AJaegerpabelanger: patching myself quickly14:20
*** eharney has joined #openstack-infra14:20
corvusmordred: ah, ok thanks!14:20
mordredfrickler: want to re-+2 this: https://review.opendev.org/#/c/655145/ ?14:21
openstackgerritAndreas Jaeger proposed opendev/puppet-openstack_infra_spec_helper master: Block hashdiff 0.3.9 and bundler 2.0.1  https://review.opendev.org/65514314:21
corvusmordred: this isn't working that great for me: telnet 2604:e100:3:0:f816:3eff:fe6b:ad62 8014:22
*** electrofelix has quit IRC14:22
pabelangerAJaeger: thanks, can help land it14:24
mordredcorvus: I agree14:24
pabelangercorvus: clarkb: mordred: with gitea, I don't see tags listed on the web interface, am I missing something obvious or is that setting not enabled?14:24
mordredpabelanger: click the branch dropdown, then click tags14:25
pabelangerquiquell|rover: I think it is logged in statsd otherwise I _think_ it is in db14:25
*** armax has joined #openstack-infra14:26
pabelangermordred: ah, thanks!14:26
pabelangerI was looking for a releases tab like in github14:26
mordredpabelanger: yes - we removed the releases tab because it inappropriately provides download links for git export tarballs, which is a terrible idea14:27
mordredand causes people to think that downloading those things and using them might work14:27
corvuswhich, incidentally, is something that github does and we can not prevent14:27
mordredyeah14:27
corvusso github is making its own "releases" of openstack14:28
mordredI wouldn't mind re-enabling the page if we could fix it to _only_ show manually uploaded artifacts14:28
mordredsince there is a "create release" and "upload artifact" api14:28
corvusi bet we could make a patch14:28
mordredyeah14:28
mnaserhmm14:29
*** rlandy|ruck|mtg is now known as rlandy|ruck14:30
mnasercorvus, mordred: curl 2604:e100:3:0:f816:3eff:fe6b:ad62:80 returns no route to host, but hitting 8080 gives connection refused14:30
mnaserso.. I don't think this is something we're doing?14:30
corvusyeah, it looks like the LB is only listening on v4: tcp        0      0 0.0.0.0:http            0.0.0.0:*               LISTEN14:31
corvusi was just trying to verify that we restarted it after the config change that should have it listening on v614:31
pabelangermordred: ack, thanks14:32
corvusi'm not 100% sure of that, so i think it's worth a restart14:32
mordredcorvus: ++14:33
quiquell|roverpabelanger: do you know where http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1 gets its data from?14:35
corvusi'm going to restart the container... mostly because we've never tested the "-sf" haproxy option in a container.14:35
corvusthere will be a short interruption in service14:35
corvusdone14:36
corvustelnet to the v6 address works now14:37
pabelangerquiquell|rover: graphite.opendev.org14:37
corvusso that seems to have been the issue14:37
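For the record, confirming the fix looks something like this (the first command run on the load balancer host itself; the ports are the public http/https listeners):

    # haproxy should now show listeners on both address families
    sudo ss -tlnp | grep -E ':(80|443) '
    # and a v6-only client should get a response instead of "no route to host"
    curl -6 -sI https://opendev.org/ | head -1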
quiquell|roverthanks14:37
*** amansi26 has quit IRC14:39
*** lpetrut has quit IRC14:40
clarkbinfra-root https://review.opendev.org/#/c/655143/2 is the expected fix for system-config jobs which should allow us to merge the git stack size fix for openstack/openstack14:42
*** smarcet has quit IRC14:42
*** nhicher has joined #openstack-infra14:42
quiquell|roverpabelanger, clarkb: do we store queued_time here https://graphite01.opendev.org/ I cannot find it14:44
quiquell|roverenqueued_time I mean14:44
*** smarcet has joined #openstack-infra14:44
openstackgerritPaul Belanger proposed zuul/zuul master: Bump lru_cache size to 10  https://review.opendev.org/65517314:44
*** udesale has quit IRC14:45
*** udesale has joined #openstack-infra14:46
*** ccamacho has quit IRC14:46
*** smarcet has quit IRC14:47
clarkbquiquell|rover: stats.timers.zuul.tenant.zuul.pipeline.check.resident_time.count is an example of that data I think14:48
*** smarcet has joined #openstack-infra14:48
quiquell|roverclarkb: can we filter that per queue name ?14:49
quiquell|roverclarkb: or queue name is not stored ?14:49
clarkbquiquell|rover: 'check' is the pipeline name14:49
*** ramishra has quit IRC14:49
AJaegerclarkb: the infra spec helper fix fails, see http://logs.openstack.org/43/655143/2/check/legacy-puppet-openstack-infra-spec-helper-unit-ubuntu-trusty/8f2dede/job-output.txt.gz#_2019-04-23_14_45_25_116490 - any ideas?14:50
clarkbAJaeger: oh I think this was the thing that cmurphy and mordred were looking at. I think that repo may not be self testing at the moment14:51
clarkbmordred: cmurphy ^ are you able to confirm that? we may have to force merge that change :/14:51
*** amoralej has joined #openstack-infra14:52
mordredclarkb: there's definitely something bonged with those jobs that we should dig in to - I believe last time we just force-merged but I don't have specifics this instant14:54
quiquell|roverclarkb: was looking for something like stats.timers.zuul.tenant.zuul.pipeline.periodic.queue.tripleo.resident_time.count14:54
openstackgerritMerged opendev/jeepyb master: Use opendev and https by default  https://review.opendev.org/65514514:54
clarkbquiquell|rover: I don't think we aggregate by the pipeline queue14:55
quiquell|roverweshay|rover: ^14:55
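For pulling those numbers without the Grafana dashboard, the Graphite render API works directly; a sketch using the metric clarkb quoted, swapping .count for the statsd-generated .mean (the per-queue breakdown quiquell|rover asked about does not exist):

    # mean resident time for the check pipeline over the last 30 days, as JSON
    curl -s 'https://graphite.opendev.org/render?target=stats.timers.zuul.tenant.zuul.pipeline.check.resident_time.mean&from=-30days&format=json'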
clarkbmordred: do you have an opinion on whether or not a force merge would be appropriate here?14:56
*** lpetrut has joined #openstack-infra14:56
amoralejis there any known issue with nodes running in rax-ord?14:56
mordredclarkb: I don't - I want to dig in to the construction of that more and see if I can understand all the pieces better - but haven't had time - and now I need to jump on a call for a half hour14:57
clarkbamoralej: mnaser mentioend some problems with job retries. but I don't think anyone has had a chance to debug yet14:57
amoralejack14:57
mordredclarkb: I don't think force-merging is likely to break anything any _more_ though14:57
clarkbmordred: ya I'm on a call myself14:57
amoraleji see jobs failing and being retried14:57
clarkbamoralej: ya that was what mnaser described. If a job fails in pre run stage it will be retried up to 3 times14:57
mnaserhttp://zuul.openstack.org/builds?result=RETRY_LIMIT14:58
clarkbI've got a meeting in just a minute but after can start helping to look at it14:58
mnaserit looks pretty wide spread right now but yeah, leaving it for who can look at it..14:58
cmurphyclarkb: this seems like a different issue than what i was worried about14:58
clarkbcapturing what the console log of a job that gets retried would be useful if not already done14:58
amoralejin my case, they are failing not in pre, but in the run playbook14:58
cmurphythese unit tests are just looking for the spec helper in ../.. so it should still work14:59
openstackgerritMatt Riedemann proposed opendev/elastic-recheck master: Add query for VolumeAttachment lazy load bug 1826000  https://review.opendev.org/65517714:59
openstackbug 1826000 in Cinder "Intermittent 500 error when listing volumes with details and all_tenants=1 during tempest cleanup" [Undecided,Confirmed] https://launchpad.net/bugs/182600014:59
clarkbcmurphy: hrm that change pins bundler to < 2.3.014:59
clarkbcmurphy: so maybe there is another chicken and egg in the testing?14:59
cmurphyclarkb: actually it doesn't in the unit tests http://logs.openstack.org/43/655143/2/check/legacy-puppet-openstack-infra-spec-helper-unit-ubuntu-trusty/8f2dede/job-output.txt.gz#_2019-04-23_14_45_21_61090415:00
AJaegercmurphy: or is my change wrong?15:00
cmurphyAJaeger: you need to edit run_unit_tests.sh too15:00
cmurphyhttps://opendev.org/opendev/puppet-openstack_infra_spec_helper/src/branch/master/run_unit_tests.sh#L4115:00
amoralejclarkb, in some cases nothing at all https://imgur.com/a/kj6C7c515:01
openstackgerritSlawek Kaplonski proposed openstack/project-config master: Switch py35 periodic jobs to py36 in Neutron's dashboard  https://review.opendev.org/65517815:01
openstackgerritNicolas Hicher proposed openstack/diskimage-builder master: openssh-server: enforce sshd config  https://review.opendev.org/65389015:02
AJaegercmurphy: can I just say "gem install bundler < 2.3.0" ? Or what's the syntax?15:02
cmurphyAJaeger: i think with a -v15:03
cmurphyor --version15:03
AJaegerthanks15:04
amoralejclarkb, another one http://paste.openstack.org/show/749648/ - this one seems to have failed with unreachable and then remote host identification has changed15:04
amoralejnode redeployed?15:05
clarkbremote host id changing is often due to neutron reusing IPs15:05
clarkb(yay dogfooding)15:05
fungiokay, i think i'm caught up on scrollback in here, so hopefully after my conference call i can help fix some of the new broken15:05
openstackgerritAndreas Jaeger proposed opendev/puppet-openstack_infra_spec_helper master: Block hashdiff 0.3.9 and bundler 2.0.1  https://review.opendev.org/65514315:06
AJaegercmurphy, clarkb, next try ^15:06
amoralejclarkb, i have some consoles where i just see regular messages and suddenly --- END OF STREAM ---15:07
amoralejnot sure if i'm losing messages if i'm not in the console window or something15:07
*** yamamoto has joined #openstack-infra15:07
*** zul has joined #openstack-infra15:08
clarkbamoralej: if the networking is completely broken it won't be able to stream the data off the host anymore15:08
*** ykarel is now known as ykarel|away15:08
*** gyee has joined #openstack-infra15:09
*** yamamoto has quit IRC15:14
*** ccamacho has joined #openstack-infra15:17
clarkbamoralej: looking at message:"REMOTE HOST IDENTIFICATION HAS CHANGED" AND filename:"job-output.txt" in logstash it seems that infra jobs are the biggest problem with that particular error and that is a centos7 specific issue15:17
clarkbotherwise it affects multiple clouds and multiple images15:17
clarkbthe vast majority are on a single zuul executor but also affects multiple zuul executors15:18
clarkbI wonder if it is the executors that are at least part of the problem15:18
clarkbbased on that data we are retrying properly and jobs are eventually rerunning and passing15:20
clarkb(at least in some cases)15:20
fungi"The fingerprint for the ED25519 key sent by the remote host is\nSHA256:..."15:20
fungiumm15:20
clarkbit also peaked a couple hours ago and seems to be tapering off now15:20
clarkbfungi: it's a known issue that neutron will reuse IPs in some clouds, causing these failures, and then ARP fights happen15:21
amoralejclarkb, yeah, jobs are running again, let's see how this run goes15:21
clarkbfungi: we could potentially avoid some of that struggle if we were able to use ipv6 more aggressively15:21
amoralejso far, some jobs are passing or properly failing15:21
amoralejwith no infra issues15:21
clarkbya lets monitor it. The graph data implies it could've been a provider blip that has been corrected15:21
AJaegerdo we want to add ovh-bhs1 back again? I tried pinging amorin here but never got a reply...15:21
*** lpetrut has quit IRC15:22
clarkbAJaeger: I think we can try it and turn it off again if it is still sad15:22
fungithe failures matching the query you provided span all rax regions as well as ovh-gra115:22
AJaegerclarkb: change is https://review.opendev.org/#/c/653879/ - I'll +2 now15:23
clarkbfungi: ya our ipv4 clouds :)15:23
fungithough yeah the biggest volume in the past day was around 13:00 to 13:30 and almost exclusively in ovh-gra115:24
clarkbfungi: and a single infra job15:24
fungiso the rax hits may be rogue vms15:24
fungioh, yep, puppet-beaker-rspec-centos-7-infra for the big spike15:24
fungistrange correlation15:24
fungialso do we still need the puppet-beaker-rspec-centos-7-infra job?15:25
clarkbno I thought I had removed it15:26
clarkb(we should do further cleanup on those as necessary)15:26
fungiso anyway, we assume the key mismatch errors are unrelated to the retry_limit results15:27
fungidoesn't seem to be any overlap of significance15:27
clarkbthe key mismatches will cause retries15:27
clarkband if you get 3 in a row a retry_limit error15:27
clarkbdepends on whether or not the error happens in pre run15:27
fungithis one reported roughly 50 minutes ago and seems to be a sudden disconnect: http://logs.openstack.org/78/655078/1/check/tacker-functional-devstack-multinode-python3/5ed80d7/job-output.txt.gz#_2019-04-23_14_39_43_44692315:28
AJaegerclarkb, cmurphy , https://review.opendev.org/#/c/655143/ did not work ;(15:29
fungiSetting up conntrackd was the last thing it was doing... will look for a pattern15:29
clarkbfungi: that is suspicious15:29
fungiyeah15:29
clarkbAJaeger: I think bundler 2.0.0 also requires ruby >= 2.3.0 15:30
clarkbAJaeger: see https://rubygems.org/gems/bundler/versions/2.0.0 15:30
AJaegerI request 2.0.0 now15:30
clarkbAJaeger: so we may want < 2.0.0 15:30
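Putting cmurphy's -v hint together with that constraint, the pin being discussed amounts to something like the following (a sketch; the exact bound is whatever the final patchset of 655143 settles on):

    # install the newest bundler older than 2.0, which still runs on ruby < 2.3
    gem install bundler -v '< 2.0'
    bundler --version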
fungiprobably not conntrackd... this other one died around the same time in the same way but while installing different packages: http://logs.openstack.org/78/655078/1/check/tacker-functional-devstack-multinode/2900b2b/job-output.txt.gz#_2019-04-23_14_39_49_82983215:30
clarkbfungi: that could be a race in buggering15:30
clarkbfungi: the manpages happen just before conntrack in your first log15:30
fungipossible...15:31
clarkb*buffering15:31
fungiyeah, both apparently ;)15:31
openstackgerritAndreas Jaeger proposed opendev/puppet-openstack_infra_spec_helper master: Block hashdiff 0.3.9 and bundler 2.0.1  https://review.opendev.org/65514315:31
clarkbwe only need trusty for a few more days hopefully :/ and then we can remove testing for it as well as centos715:32
fungithis one died while configuring swap: http://logs.openstack.org/95/654995/1/check/designate-pdns4-postgres/21337d4/job-output.txt.gz#_2019-04-23_13_45_34_58501715:32
clarkbfungi: I wonder if those timestamps coincide with the other network issues15:32
clarkblike maybe our executors had broken ipv4 networking during that time15:33
clarkb(or something along those lines)15:33
fungii was starting to have a similar suspicion, maybe network issues in or near rax-dfw?15:33
openstackgerritAndreas Jaeger proposed opendev/puppet-openstack_infra_spec_helper master: Block hashdiff 0.3.9 and bundler 2.0  https://review.opendev.org/65514315:33
clarkbya15:34
fungiis https://rackspace.service-now.com/system_status/ blank for anyone else?15:35
clarkbnot for me15:35
clarkbthere is nothing listed there15:35
clarkbI mean its not blank but no issues posted either15:36
clarkbperhaps was upstream of them and they didn't even notice15:36
fungifirefox just give me a blank page for it. oh well15:36
fungier, gives15:36
AJaegerfungi: works on firefox for me - but takes a bit to load, seems to use some javascript...15:37
*** dustinc_away is now known as dustinc15:37
*** helenafm has quit IRC15:37
fungiyeah, looks from the page source like it's opening a new window or something. hard to tell what exactly... end result is i get no content but also my privacy extensions aren't reporting blocking anything15:38
openstackgerritMerged opendev/elastic-recheck master: Add query for VolumeAttachment lazy load bug 1826000  https://review.opendev.org/65517715:38
openstackbug 1826000 in Cinder "Intermittent 500 error when listing volumes with details and all_tenants=1 during tempest cleanup" [Undecided,Confirmed] https://launchpad.net/bugs/182600015:38
*** jamesdenton has joined #openstack-infra15:38
*** kjackal has quit IRC15:41
*** ccamacho has quit IRC15:46
*** ccamacho has joined #openstack-infra15:47
smcginnisSo we've gotten a lot more failures with ensure-twine being run in a virtualenv.15:49
smcginnisStill not sure where that is coming from.15:49
smcginnisWould it make sense to add a check in https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/ensure-twine/tasks/main.yaml to look for {{ lookup('env', 'VIRTUAL_ENV') }} and drop the --user if set?15:50
*** e0ne has quit IRC15:50
smcginnisProbably also add a debug print of VIRTUAL_ENV so we can figure out where it's coming from too.15:50
szahersts15:51
szahersts15:52
amoralejclarkb, i was tracking 6 different jobs, all failed, i've pasted info about nodes and error messages in http://paste.openstack.org/show/749650/15:53
amoralejthere are different error messages15:53
amoralejalthough it seems all may be network related15:54
*** sshnaidm is now known as sshnaidm|afk15:54
clarkbsmcginnis: you might have to do a python script instead if it isn't part of the env but is instead coming from the executable path. And ya I think adding that debugging info would be great15:54
fricklerboden: fyi, codesearch should be back to work now15:54
clarkbamoralej: were any of them retried?15:55
amoralejretries are in queue15:55
*** ykarel|away has quit IRC15:55
smcginnisclarkb: Ah, I assumed the environment would have been activated since we are just calling "python3" and getting that error.15:55
amoralejwell at least one has reached retry_limit15:55
bodenfrickler great thanks much!  so is the long term plan to use the "explore" from opendev, or to keep the "hound" code search?15:55
amoralejand all of them were retries of previous failed run15:55
clarkbsmcginnis: there are two major ways to venv. One is via env vars. The other is to run python out of a venv directly (maybe our path is being munged?)15:56
mordredboden: I think we need to discuss the long-term plan ... I think we'd like to be able to collapse things and not need to run both codesearch and gitea ... but the searching in "explore" has some deficiencies at the moment and I think we need to discuss what needs to be or can be done and what the plan will be15:57
weshay|roversorry to bug you guys, are the stats from https://review.opendev.org/#/c/616306/15/releasenotes/notes/resource-usage-stats-bfcd6765ef4a9c86.yaml  public or something only avail to infra.. if so I was hoping to present the stats at the tripleo mtg at ptg15:57
mordredboden: which is to say - I don't think there is yet a full plan - more like a latent desire15:57
* weshay|rover trying to see how tripleo performed in being a good upstream citizen in stein15:57
*** dave-mccowan has joined #openstack-infra15:57
clarkbweshay|rover: they should be in the same graphite server15:58
mordredweshay|rover: yes - those are in graphite15:58
clarkbweshay|rover: but that change is not merged yet15:58
bodenmordred ack and thanks for the info and everyones help on this15:58
mordredboden: sure thing! thanks for the patience, I'm glad we were able to get hound back up and running properly :)15:58
weshay|roverk.. thanks15:59
clarkbweshay|rover: I can do a log parsing run in a bit to give you numbers for the last 30 days15:59
weshay|roverclarkb k.. I know ur busy, it's not critical.. but a nice to have :)15:59
clarkbwell I'm waiting for test results on AJaeger's puppet testing fix so I have a few minutes now :)16:00
clarkbamoralej: my hunch is that we've got ongoing instability in ipv4 networking between our executors and the test clouds we reach over ipv416:01
clarkbfungi: ^ any ideas on testing that more directly16:01
clarkbfungi: mtr between ze0* and ovh and rax-iad/ord?16:01
clarkbweshay|rover: http://paste.openstack.org/show/749651/16:02
clarkbamoralej: limestone and vexxhost are talked to via ipv6. Inap is our other ipv4 cloud. If we can find evidence of trouble to vexxhost or limestone we may be able to rule out this theory16:03
amoralejclarkb, that'd make sense, i'm trying to find some pattern in logstash16:04
*** quiquell|rover is now known as quiquell|off16:04
weshay|roverclarkb thanks.. comparing to http://paste.openstack.org/show/736797/  42.6 -> 24.8 not bad :)16:05
corvusclarkb: i started mtrs last week between ze01 and sjc1 v4/v6, rax-ord, rax-iad, rax-dfw, and google dns16:05
*** jpich has quit IRC16:05
clarkbweshay|rover: yup seems to have been steady progress since we started tracking it16:05
corvussjc1v6 had some noticable packet loss, google dns had a very small amount, nothing on the others.16:06
clarkbcorvus: are you doing ipv6 or ipv4 to rax-* ?16:06
corvusto clarify, those mtrs are still running16:06
*** dave-mccowan has quit IRC16:07
*** Lucas_Gray has quit IRC16:07
corvusclarkb: v4 for some reason16:07
weshay|roverclarkb thanks for the help!16:07
clarkbcorvus: I think that is what we want to know for this theory at least. Good to know there isn't any loss there16:08
*** pgaxatte has quit IRC16:09
corvusclarkb: AJaeger was saying we still have bhs1 disabled; mnaser disabled it because of network errors, but then i think we've started to suspect that might have been the same errors we're seeing everywhere?16:09
clarkbcorvus: ya and AJaeger has asked us to reenable bhs1 as a result16:10
clarkblet me find the change16:10
mnaserbut I think at the time clarkb had a vm he was doing tests on16:10
mnaserand it was losing packets or whatnot16:10
clarkbcorvus: https://review.opendev.org/#/c/653879/16:10
clarkbmnaser: yes but could have been related to general network sadness? it was a personal vm in ovh1-bhs1 that had trouble talking to cloudflare dns16:10
clarkbmnaser: those failures could still have been related to the same thing that is making our other traffic unhappy is what I was trying to say16:11
corvusi've added bhs1 and inap to my mtr screen on ze0116:11
*** jbadiapa has quit IRC16:11
mnaserright, but at the time we found proof in unbound that some of these failed jobs failed to contact 1.1.1.1 too in unbound (but of course, that can be all gone now)16:11
clarkbinfra-root https://review.opendev.org/#/c/655143/ passes now and should fix system-config tests allowing us to fix git stack sizes16:12
mnaserbut anyways, yes, there seems to be something weird going on16:12
clarkbif anyone can be second review on that real quick it would be much appreciated16:12
mnaseralso, if someone wants to run mtr from outside rax to sjc1v6 .. in case there's actual issues16:12
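For anyone outside Rackspace who wants to help with that, the run corvus describes is just mtr left in report mode; the target below is the sjc1 v6 address already quoted in channel:

    # 100-cycle path report over IPv6, numeric hops only
    mtr -6 --report --report-cycles 100 --no-dns 2604:e100:3:0:f816:3eff:fe6b:ad62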
corvusdid i see a theory about executor localization?16:12
corvus+316:12
clarkbcorvus: localization like i18n? or physical location of executors playing a part? we did wonder if perhaps the problem was more on the executor side which would explain widespread impact16:13
fungismcginnis: clarkb: i thought we were preinstalling twine on our executors for use in our release jobs. i wonder if the pip install error is due to us failing to actually preinstall it on some executors? or maybe failing to preinstall it for some interpreters?16:13
corvusclarkb: the second thing16:13
clarkbfungi: oh! the move to ansible venvs for zuul would explain that16:13
clarkbfungi: we have a list of things to install into those venvs and I wouldn't be surprised if twine is not on that16:13
smcginnisfungi: There is a check for `which twine` that fails.16:14
clarkband that may also explain the "it's a virtualenv, jim" problem16:14
corvusclarkb: yeah, if it's widespread, it means one or more of the following: (a) the internet is broken (b) rax-dfw networking is broken (c) networking on one or more executors is broken16:14
*** ijw has joined #openstack-infra16:14
corvusi was wondering if someone has correlated failures to suggest (c)16:14
clarkbno I don't think we have managed to get beyond "it is a theoretical possibility"16:15
smcginnisclarkb, fungi: So should I worry about mucking with ensure-twine, update ensure-twine to not ever do --user, or wait for preinstalling to be fixed?16:15
mnaserI guess we can check if the # of failures per executor is higher16:15
clarkbsmcginnis: we can probably wait for the preinstall to be fixed if that is an intended feature16:15
*** lucasagomes has quit IRC16:16
amoralejclarkb, looking for the "RESULT_UNREACHABLE" message in the last 6 hours shows some high peaks16:16
*** ykarel|away has joined #openstack-infra16:16
clarkbamoralej: about 3 hours ago?16:16
smcginnisclarkb: OK, that's probably easiest for me. Do you think we should actually drop the ensure-twine role completely if there is an expectation that it will always be preinstalled?16:17
amoralejhigher peak is at 15:45 - 15:5016:17
amoralejup to 124 in 5 minutes16:17
openstackgerritPaul Belanger proposed zuul/zuul master: Use user.html_url for github reporter messages  https://review.opendev.org/65518816:17
smcginnisAnd this is blocking some releases, so second question would be who is taking that action and do we have an ETA?16:17
clarkbamoralej: oh good to know16:17
amoralejalso at around 16:4016:17
clarkbsmcginnis: would you like to take the action?16:17
fungismcginnis: the ensure-twine role, i think, is included in the job in the zuul-jobs standard library, for the benefit of folks who don't preinstall twine on executors16:17
amoralejand 14:20 - 14:2516:18
smcginnisclarkb: Not sure if I can take that one.16:18
smcginnisfungi: Sounds like it should probably be fixed then if there's a chance others may use this role in cases where it is not preinstalled.16:18
smcginnisDo I understand right that with a zuul change this will always be run within a venv?16:18
smcginnisIn which case we just need to drop "--user" from the pip install.16:19
fungii believe that behavior will depend on whether the given zuul deployment is configured to manage its own ansible installs, though i could be wrong16:19
smcginnisOK, so we probably do need to make that more robust to be able to handle both cases.16:20
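
A minimal sketch of the "handle both cases" idea at the shell level, assuming the check mirrors how pip decides it is inside a virtualenv (old-style virtualenv sets sys.real_prefix, venv makes sys.prefix differ from sys.base_prefix); this is an illustration, not the eventual ensure-twine patch:

    # only pass --user when the target interpreter is NOT inside a virtualenv
    if python3 -c 'import sys; in_venv = hasattr(sys, "real_prefix") or sys.prefix != getattr(sys, "base_prefix", sys.prefix); sys.exit(1 if in_venv else 0)'; then
        python3 -m pip install --user twine
    else
        python3 -m pip install twine
    fi
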
clarkbcorvus: any idea where the list of things to install into the zuul ansible venvs is?16:20
fungiin cases where ansible is not being run from a virtualenv, --user installs presumably work16:20
corvusclarkb: on it16:20
clarkbI'm not having good luck finding it but know that we had to add gear to it semi recently16:20
smcginnisOdd that we had a mix of those working. Maybe due to it being preinstalled some places but not others?16:20
openstackgerritJames E. Blair proposed opendev/puppet-zuul master: Install twine in executor Ansible environments  https://review.opendev.org/65518916:21
corvusclarkb, fungi, smcginnis ^16:21
clarkbtyty16:21
smcginnisThanks corvus16:21
amoralejclarkb, https://imgur.com/a/Xk7Qnk6 in case it helps16:21
corvusdocs here: https://zuul-ci.org/docs/zuul/admin/installation.html#ansible16:21
amoralejthere are failures in ovh and limestone-regionone too16:22
clarkbok so failures in limestone imply that this isn't ipv4 specific16:22
corvusamoralej: can you correlate with zuul_executor and see if there's a pattern?16:22
smcginniscorvus: Does that need to also include the others here: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/ensure-twine/tasks/main.yaml#L1416:23
corvussmcginnis: yep16:23
*** kopecmartin is now known as kopecmartin|off16:23
*** mrhillsman is now known as openlab16:23
*** openlab is now known as mrhillsman16:23
corvussmcginnis, clarkb, fungi: how did we used to have twine pre-installed?16:24
clarkbcorvus: it is/was in system-config/manifests/site.pp on the zuul scheduler16:24
amoralejcorvus, seems dispersed16:24
clarkber executor16:24
*** mrhillsman is now known as openlab16:24
corvusclarkb: 'git grep twine' in system config is nil16:24
clarkbhrm no that is just gear ok I'm wrong16:24
corvusamoralej: thanks16:24
*** openlab is now known as mrhillsman16:25
fungiugh, firefox is back to showing a "corrupted content error" about network protocol violations when trying to browse opendev.org16:25
corvusperhaps we did not pre-install?16:25
clarkbcorvus: ya that could be but python wasn't a venv and so --user worked16:25
corvusfungi: i restarted the haproxy ~2hours ago16:25
fungiahh, maybe that's it16:25
clarkbso maybe the proper fix here is to simply fix it in the job / role16:25
fungicorvus: how could we have been using twine on the executors if it wasn't preinstalled? what options did we have for installing software on executors? is that allowed in bubblewrap?16:26
clarkbfungi: it was running the pip install16:26
clarkb(or could have run the pip install that it is running now)16:26
amoralejcorvus, executors go from ze01..ze13 ?16:27
amoralej12 i meant16:27
corvuson ze01, python3 -c "import twine" -> ImportError: No module named 'twine'16:27
corvusamoralej: yes16:27
fungiso pip install --user was working under bwrap-managed homedirs i guess?16:27
clarkbfungi: yes that is my hunch16:27
corvusi WIP'd https://review.opendev.org/65518916:28
fungii recall having discussions about needing tools we're going to run on executors preinstalled for security reasons, but i concur i don't see it in the puppet-zuul git history16:28
*** smarcet has quit IRC16:28
*** dtantsur is now known as dtantsur|afk16:28
clarkbit is also possible that pip being mad about --user from within a venv is new16:29
corvuswe certainly need system packages pre-installed, but installing a python package from our mirror should be 99.99999% reliable16:29
corvusand permitted by bwrap (at least when run in a trusted playbook)16:29
*** smarcet has joined #openstack-infra16:30
fungiwhich also brings me back to wondering why this was only failing sporadically on some builds and then worked when we reenqueued them, but has now started to fail consistently16:30
corvusthat's an improvement16:30
*** whoami-rajat has quit IRC16:30
fungiover the course of a week or so16:31
clarkbfungi: image updates?16:31
clarkbno we run on the executor16:31
amoralejcorvus, http://paste.openstack.org/show/749655/ note a single failed job can have more than one RESULT_UNREACHABLE message, i'm trying to clean it more16:32
fungiso analyzing a recent release ensure-twine failure of this nature: http://logs.openstack.org/02/02dc0019af5f47d1850781b83e6041201054e1c5/release/release-openstack-python/9e49a6f/job-output.txt.gz#_2019-04-22_21_30_35_95093416:34
clarkbcorvus: looking at ze01 there are not very many ssh connections (`sudo lsof -n -i TCP`) and I can't hit one of the hosts that is SYN_SENT and not ESTABLISHED from home16:34
clarkbcorvus: implying that the host is actually not reachable16:34
clarkbI'm going to boot a testnode or three in rax iad16:35
clarkbout of band of nodepool and see if they are reachable from the executors and home16:36
corvusthe jobs are a mix of devstack and non-devstack (eg, osa), right?16:36
corvus(the unreachable node jobs)16:36
mnaseryep ^16:36
fungicorvus: yes, and tox stuff too from what i saw16:36
*** rcernin has quit IRC16:36
mnaserand I have some OSA failures that just failed in a super random spot (so nothing having to do with network related operations)16:36
fungia lot of jobs just terminate partway through (at different points), declaring ssh unreachable, with the console stream ending prematurely16:37
*** bobh has joined #openstack-infra16:37
AJaegerfungi: care to change your WIP to +2A on https://review.opendev.org/#/c/653018/ to give us imap back? See last comment there...16:37
fungidone16:38
AJaegerthanks16:38
clarkbfungi: yup and looking at executor tcp connections there aren't a ton of ssh connections16:39
*** whoami-rajat has joined #openstack-infra16:39
fungiokay, so on the ensure-twine problem... i have to assume that the virtualenv it's talking about in the error is the one zuul is managing for ansible... if so, that's going to be outside bwrap and mapped in read-only so a regular pip install without --user won't work, right?16:40
clarkbfungi: yes16:40
amoralejclarkb, it seems there has been another peak right 10 minutes ago16:43
clarkbamoralej: I wonder if that has to do with job runtimes and our use of ssh control persist16:43
*** ginopc has quit IRC16:44
fungimost recent pip release was over a month ago, most recent virtualenv release was nearly a month ago, so those don't really seem to line up with when we started seeing the ensure-twine failures16:45
clarkbfungi: virtualenv updates its pip on install now. possible it is just pip and not virtualenv?16:45
*** mattw4 has joined #openstack-infra16:45
*** rossella_s has quit IRC16:45
fungiwell, neither released new versions around the time we started to see the problem, which was after the stein release16:46
*** tosky has quit IRC16:46
*** udesale has quit IRC16:47
clarkbI have 3 hosts up in rax-iad now all show zero packet loss to various executors. They all ended up in the same /24 though (so if network range problem this may not expose it)16:48
*** eharney has quit IRC16:48
amoralejclarkb, for the reviews i've been closely monitoring, failing jobs are long running ones, but it's hard to say if it's related to long jobs or just it's more likely that you hit network issues in long jobs16:49
clarkbamoralej: ya we use ssh control persistence to reduce the number of connections that have to be made too. If there is network trouble it is possible that longer jobs are more likely to run into it given the mitigations we already have16:50
openstackgerritMerged openstack/project-config master: Revert "Temporarily disable inap-mtl01 for maintenance"  https://review.opendev.org/65301816:50
*** altlogbot_2 has quit IRC16:50
mordredclarkb, amoralej: I have anecdotally observed the same thing - the issue seems to be most seen while running a long-running single shell task16:50
mordredbut the plural of anecdote isn't data - so I don't know if that's a real thing or just what I happen to have observed16:51
amoralejmordred, yes, that's also my case16:51
openstackgerritJason Lee proposed opendev/storyboard master: WIP: Second correction to Loader in preparation for Writer Update  https://review.opendev.org/65481216:52
*** jpena is now known as jpena|off16:52
fungiwell, also longer-running jobs are simply statistically more likely to fall victim to a random network problem16:52
mordredfungi: yes, this is a very accurate statement16:52
clarkbalso nodepool checks ssh connectivity before giving the node to a job16:52
mordredand within those jobs, long-running single tasks are statistically more likely to be the thing that hits it16:52
fungiyep16:53
clarkbso we know that networking works well enough for that to be successful before zuul gets the node. I am going to let my test nodes sit around for a bit as a result and see if they look worse in an hour16:53
clarkbwe may also want to sanity check nodepool isn't getting duplicate IPs16:53
clarkbmaybe we are our own noisy neighbor type situation16:53
*** altlogbot_3 has joined #openstack-infra16:55
clarkbas a spot check we do recycle ip addrs, but not during overlapping time periods (if arp was not updating properly we might see this behavior)16:56
fungii guess our job logs don't actually say where the ansible they're running is installed on the executor? at least i can't seem to find that information. also the docs for zuul-manage-ansible don't say how it installs ansible... the versioned trees under /var/lib/zuul/ansible/ on our executors don't look like virtualenvs either16:56
*** derekh has quit IRC16:57
clarkbI need to step out for breakfast I'll be back in a bit to look into this networking stuff more16:58
fungidigging into the AnsibleManager class definition now16:58
fungiaha, https://zuul-ci.org/docs/zuul/admin/components.html#attr-executor.ansible_root17:00
corvusfungi: the debug log says it's running /usr/lib/zuul/ansible/2.7/bin/ansible-playbook17:00
fungioh, i bet that bindir is somehow mapped into the bwrap context17:02
fungino, nevermind17:02
fungi/usr/lib not /var/lib17:02
corvushere's a full example command: http://paste.openstack.org/show/749657/17:02
fungii guess <zuul_install_dir> is /usr in our case17:03
corvusfungi: i think the ansible venv is in /usr/lib and the zuul modules (which are also versioned) are in /var/lib17:03
fungifhs tunnel vision, i don't typically expect anything besides the system package manager to add things in /usr, thanks!17:04
fungii kept looking at that and my brain was automatically substituting /var17:04
corvusyeah, maybe we should change that17:05
funginot super critical, just me with distro-oriented blinders on17:05
*** jbadiapa has joined #openstack-infra17:05
fungiwas trying to run this down from first principles and validate our assumptions about how/where it's trying to install twine17:06
fungiso on ze01 the ansible venvs are all using python 3.5.2 and pip 19.0.317:08
fungilatest version from february 2017:09
fungiahh, right, i can't calendar. the latest pip and virtualenv versions are from two months ago, not a month ago, so even less correlated to the start of these failures17:09
fungiwe're now in april17:10
*** nicolasbock has quit IRC17:10
fungiso looks like the ansible venvs on ze01 were created on march 18, still much longer ago than ensure-twine started popping this error17:10
fungisame creation timestamp on all 12 executors17:12
fungiand definitely no twine installed in any of them right now17:14
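
A sketch of the sort of check being described here, assuming the venvs follow the /usr/lib/zuul/ansible/<version>/ layout corvus mentions above and expose a bin/python3:

    # look for twine in each executor-managed ansible venv
    for py in /usr/lib/zuul/ansible/*/bin/python3; do
        echo "== $py"
        "$py" -c 'import twine; print(twine.__version__)' 2>&1
    done
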
*** gagehugo has joined #openstack-infra17:15
fungino twine executable in the default system path for any of the executors either17:15
fungialso as previously established, the last change in git for the ensure-twine role merged january 2917:17
*** ijw has quit IRC17:18
*** ijw has joined #openstack-infra17:19
*** ijw has quit IRC17:20
*** e0ne has joined #openstack-infra17:20
*** ijw has joined #openstack-infra17:20
fungiwe're calling ensure-twine from the release-python playbook in opendev/base-jobs which was fixed to add that role on april 317:20
*** rpittau is now known as rpittau|afk17:20
*** Weifan has joined #openstack-infra17:21
fungimy notes say the first recorded case of this particular failure was 2019-04-17 in http://logs.openstack.org/19/19a7574237f44807b16c37e0983223ff57340ba3/release/release-openstack-python/769f856/17:22
*** Weifan has quit IRC17:22
fungiso roughly 6 days ago17:23
*** Weifan has joined #openstack-infra17:23
*** ijw has quit IRC17:23
openstackgerritPaul Belanger proposed zuul/zuul master: Add retries to getPullReviews() with github  https://review.opendev.org/65520417:23
clarkbI can still ssh into my three test nodes in iad after leaving them be for a while17:23
*** ijw has joined #openstack-infra17:23
*** e0ne has quit IRC17:24
fungiclarkb: i wonder, can you start up a nc on both ends and connect them to each other with no traffic for a while, then see if they get disconnected (or simply stop passing traffic)?17:24
clarkbcould be worth testing. I'm rotating out one of the three to see if a new one immediately after a delete has any interesting behavior (since that is what nodepool does)17:25
fungiyeah, wondering if the failures we see couldn't be some stateful network device losing its sh^Htates17:26
fungior aggressively dropping inactive ones17:27
clarkbya17:27
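
A rough sketch of the idle-connection test fungi is suggesting; the port is arbitrary and the -l/-p syntax differs between netcat flavors:

    # on one test node, listen:
    nc -l -p 9999
    # on the other, connect, leave it idle well past any suspected state
    # timeout, then type a line and see whether it still gets through:
    nc <first-node-ip> 9999
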
clarkb(at some point I really need to start putting together a project update and figuring out a summit schedule)17:27
clarkb(so please don't assume I should be the only one debugging this stuff :) )17:27
fungialso, one thing which can cause this... packet shapers. i'm going to look and see if there is rate limiting evidence in the cacti graphs for our executors17:28
clarkb++ thanks17:28
*** psachin has quit IRC17:28
clarkboh and we have a meeting in an hour and a half and the ptg to plan for17:28
fungimeh, "priorities" ;)17:28
*** diablo_rojo has joined #openstack-infra17:29
clarkbfungi: this is where you say "board meetings are for writing project updates" ?17:29
clarkbbtw I also checked dmesg on ze01 for any evidence of say OOMKiller and found nothing17:30
clarkband syslog lacks complaints from ssh17:30
*** jamesmcarthur has quit IRC17:30
corvusi'm working on fixing our local patch to gitea17:30
fungicacti says ze01 is running pretty tight on available memory, but i suppose that's our ram governor at work. the others are almost certainly similar17:30
*** jamesmcarthur has joined #openstack-infra17:31
corvusit's not easy because every time i try to run the unit tests, my printer starts spewing garbage17:31
fungihah17:31
clarkbcorvus: at least it isn't on fire?17:31
fungii hope you don't run out of greenbar17:31
corvusyeah.. maybe don't print the randomly generated binary test data to stdout?17:32
fungiyeesh17:32
clarkbwhat are the chances this is an ssh/ansible issue?17:32
clarkb(just wondering if we need to explore that too)17:33
fungicacti seems to only occasionally be able to reach ze02, but this doesn't look like new behavior: http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=64158&rra_id=all17:33
corvusthere have been ansible releases and we should be auto-upgrading them17:33
clarkb2.7.latest is what we should be using right?17:33
clarkbby default at least17:33
clarkb2.7 updated on april 20 according to timestamps on files17:34
clarkbbut last 2.7 release was april 417:35
*** jamesmcarthur has quit IRC17:35
clarkbwe did manage to merge an openstack manuals change so maybe not all is lost :)17:36
fungioh, that was just the documentation update which says "all is lost"17:37
fungii'm up to ze06 so far... no obvious signs of network rate-limiting or anything else especially anomalous on the graphs which might coincide with these ssh problems17:38
fungicacti seems to have collected no data whatsoever on ze1017:41
clarkbat this point I've sent several thousand mtr tracer pings and not a single one was lost between iad and 3 executors over ipv417:42
fungiso aside from the fact that cacti can't reliably reach ze02 and ze10 over ipv6 (other servers where we saw this, i think deleting and recreating the network port got it working?) i see nothing on the cacti graphs for our executors which would explain the ssh connection issues17:44
*** ccamacho has quit IRC17:54
clarkbI've just ssh'd into every ipv4 host connected to from ze0117:54
clarkband they all worked17:54
*** bobh has quit IRC17:55
openstackgerritMerged opendev/puppet-openstack_infra_spec_helper master: Block hashdiff 0.3.9 and bundler 2.0  https://review.opendev.org/65514317:55
jrosserfwiw the log I grabbed here was an ipv6 fail http://paste.openstack.org/show/749635/17:55
clarkbwoo I can recheck the gitea git stack fix now17:55
fungii wonder, should we set some autoholds and then when one catches a node which fell victim to this behavior check connectivity and/or nova console?17:55
clarkbjrosser: thanks. In this case I didn't check ipv6 because no ipv6 at home17:56
*** amoralej is now known as amoralej|off17:56
clarkbbut ya I think amoralej dug up some limestone failures that were similar and also ipv617:56
clarkbfungi: ya that might be more efficient than me trying to manually boot one that fails17:56
fungii wonder how broad of an autohold i can add17:56
clarkbI'm going to delete my three test instances in iad now since they haven't shown me anything useful17:58
clarkbcorvus: the docker registry backed by swift change merged yesterday iirc18:00
clarkbcorvus: is that something we should follow up and check on?18:00
corvusclarkb: i suspect we will need to restart it to make it go into effect, and it would be good to do that in conjunction with watching some jobs18:02
mordredcorvus, clarkb: I can take that - y'all seem to be having fun diagnosing network issues :)18:05
corvusmordred: well, i'm mostly working on the gitea change18:05
corvusbut yes, also "fun"18:05
mordredcorvus, clarkb: I can take that - y'all seem to be having fun diagnosing network issues and that gitea change :)18:05
clarkbmordred: thanks18:05
clarkbI need to switch gears into prepping for the meeting in just a moment so not sure how much network debugging I'll be doing for a bit18:06
mnasermaybe a little silly but perhaps someone shooting off an email to rax about if there is any network changes might be productive18:06
mnasermaybe there's some firewall or network appliance that was recently setup which affects our type of workloads18:06
clarkbcloudnull: ^ you about?18:07
mnaserjust an extra useful datapoint18:07
corvusi *finally* have gotten the gitea sqlite integration tests (the ones that are failing on my pr) to pass and run on master.  it turns out the procedure for running them is not the one in the docs, or in integrations/README.md, but rather, is only documented in the drone.yml config file.18:11
openstackgerritJeremy Stanley proposed opendev/system-config master: Blackhole spam for airship-discuss-owner address  https://review.opendev.org/65522718:12
clarkbhttp://paste.openstack.org/show/749660/ is a specific example I dug up ansible logs for18:16
clarkbany idea if the complaint about the inventory not being in the desired format is potentially related?18:17
clarkblike maybe we cannot connect because we broke the inventory somehow?18:17
*** ricolin has quit IRC18:17
clarkbthat was an ipv6 host in sjc1 too fwiw18:19
openstackgerritMonty Taylor proposed opendev/base-jobs master: Update opendev intermediate registry secret  https://review.opendev.org/65522818:19
clarkbI think the next steps are fungi's hold idea and filing a ticket/sending email to rax. I've got to pop out and do some stuff before the meeting but I guess we'll pick back up there18:20
mordredcorvus: ^^ we need to do that per-tenant, right?18:20
fungiyeah, i'm just to the point of fiddling with autohold now18:20
mordredcorvus: or just the once in opendev/base-jobs is fine18:20
fungii don't suppose there are any particular projects/jobs/changes which will be better choices for an autohold than others18:21
clarkbfungi: jobs that take longer to run18:21
mordredclarkb: https://review.opendev.org/655228 is needed for the registry stuff - gotta rekey on the client side too18:21
clarkbmaybe tempest-full, a tripleo job, and an OSA job?18:21
clarkbmordred: k18:21
fungii suppose i can push up some trivial dnm changes and set autoholds for some long-running jobs which will run against them18:22
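
The autoholds being set here presumably take roughly this shape (job and project names borrowed from clarkb's suggestion above; the exact flags depend on the zuul client version in use):

    # hold the first failing build of a long-running job for inspection
    zuul autohold --tenant openstack --project openstack/tempest \
        --job tempest-full --reason 'debug node unreachable failures' --count 1
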
clarkbalso we may not hold on network failures?18:22
clarkbthe example above that I pasted is being rerun aiui18:22
clarkbbecause ansbile reported it as a network failure to zuul18:23
fungioh, yeah at best it'll hold the last build which ends in retry_limit i guess?18:23
clarkbya18:24
clarkbassuming it actually gets there and doesn't magically work on the third attempt18:24
openstackgerritNicolas Hicher proposed openstack/diskimage-builder master: openssh-server: enforce sshd config  https://review.opendev.org/65389018:24
clarkbok really need to pop out for a bit now. Back soon18:24
mordredcorvus, fungi: if you get a sec, https://review.opendev.org/#/c/655228/18:26
*** e0ne has joined #openstack-infra18:29
*** nicolasbock has joined #openstack-infra18:31
*** eharney has joined #openstack-infra18:39
clarkbmnaser: bce3129d-8458-4947-b567-2c41311aab6a is the nova uuid of the node above that failed in sjc1. Might be worth sanity checking it to make sure that it didn't crash qemu/kvm (perhaps related to image updates or something)18:42
*** jamesmcarthur has joined #openstack-infra18:44
*** jamesmcarthur_ has joined #openstack-infra18:45
*** happyhemant has quit IRC18:48
*** jamesmcarthur has quit IRC18:49
*** smarcet has quit IRC18:53
corvusfungi, clarkb: yeah, you can autohold all failures if you want.  retry_limit will trigger autohold, but only on the last.18:56
corvusthat doesn't seem to be a problem though.18:56
corvusthere's a change in review to allow you to specify the result states for an autohold. :/18:56
clarkbcorvus: I think the issue is we'll only be able to hold it if it gets to retry_limit18:56
clarkbwhich is maybe good enough18:56
corvusclarkb: right.  but plenty of jobs are doing that :)18:56
*** ykarel|away has quit IRC18:57
fungicool, these are the autoholds i added: http://paste.openstack.org/show/749665/18:57
corvuswe will also get the network failures that happen in run and post-run playbooks, though those will be harder to triage out from regular failures18:57
fungiwill check periodically to see what we catch in the trap and throw back any which aren't keepers18:57
fungiit's eerily like crabbing18:58
clarkbespecially when you throw back 99% of what you get18:58
*** Weifan has quit IRC18:58
fungiyeah18:58
clarkbinfra meeting in a few minutes18:58
mordredfungi: i don't think the keepers in this case will be very tasty18:58
clarkbjoin us in #openstack-meeting18:58
corvusyou only want jobs larger than a certain size18:59
fungithough it looks like my third zuul autohold command isn't returning control to the terminal18:59
fungiand it's been a few minutes18:59
fungimaybe reconfigure in progress18:59
corvusfungi: yeah, be patient18:59
corvuswell.. hmm19:00
fungi"crabbing suspended: ocean reloading, please wait"19:00
corvusit... could be that the scheduler is out of ram19:00
fungioof19:00
corvusapparently there is a memory leak as of our april 16 restart19:01
*** e0ne has quit IRC19:01
mwhahahaany particular reason why centos7 jobs are RETRY_LIMITing?19:02
mwhahahasee https://review.opendev.org/#/c/654648/ (only centos7 jobs did)19:02
corvusi will restart scheduler now19:02
clarkbmwhahaha: it is affecting all jobs19:02
clarkbmwhahaha: we've been trying to sort it out for much of the morning19:02
mwhahahak19:02
corvushelp appreciated19:02
mwhahahait didn't seem to affect the bionic jobs on that one19:03
mwhahahaweird19:03
corvusclarkb, fungi: what do you think about taking the opportunity to restart all executors?19:03
fungiwow, yeah we've been swapping on the scheduler since ~12:30z today19:03
fungicorvus: seems like a good idea19:03
clarkbcorvus: maybe even reboot them if it is possible ssh issues are related to the system19:03
fungiyes, exactly my thoughs19:03
corvusthey have been running for a long time -- just incase some cruft has accumulated19:03
fungithoughts19:04
fungishould we use the full restart playbook for this?19:04
corvusfungi: almost -- if we want to reboot that's an extra step19:05
corvusso i'll just do it manually19:05
fungiahh, yeah i missed clarkb's use of the word "reboot"19:05
fungistatus notice the zuul scheduler is being restarted now in order to address a memory utilization problem; changes under test will be reenqueued automatically19:07
fungithat look sufficient?19:07
corvusfungi: ++19:07
clarkbfungi: ++19:07
fungi#status notice the zuul scheduler is being restarted now in order to address a memory utilization problem; changes under test will be reenqueued automatically19:07
openstackstatusfungi: sending notice19:07
openstackgerritNate Johnston proposed openstack/project-config master: Track neutron uwsgi jobs move to check queue  https://review.opendev.org/65523419:08
-openstackstatus- NOTICE: the zuul scheduler is being restarted now in order to address a memory utilization problem; changes under test will be reenqueued automatically19:08
openstackstatusfungi: finished sending notice19:10
corvusstill waiting on execs to stop19:12
corvusall stopped19:14
corvusi will reboot all mergers and executors19:14
fungithanks19:15
*** kjackal has joined #openstack-infra19:16
smcginnisclarkb, fungi, corvus: Sorry, I had to step out for awhile. Do we still need an update to ensure-twine to check whether to install with --user or not? I saw a comment that I think was saying it might not help, but I wasn't really sure.19:18
corvusexecutors and mergers are up and running19:19
corvusrestarting sched now19:19
fungismcginnis: indeterminate. i went back to the drawing board trying to confirm how things could have been working previously and working up a timeline of what we know changed when19:21
fungibecause it's still baffling19:21
smcginnisYeah, very baffling.19:22
smcginnisI would feel much better if we understood what changed that caused this. That first one that worked after a reenqueue was odd.19:22
corvuser, neat.  something killed the scheduler19:24
mordredcorvus: "awesome"19:24
Shrewswow19:26
corvuswe've seen this once before, also never found out what it was19:26
corvustrying again19:26
*** nicolasbock has quit IRC19:27
*** nicolasbock has joined #openstack-infra19:27
*** wehde has joined #openstack-infra19:29
wehdeCan anyone help me figure out a neutron issue?19:29
*** igordc has joined #openstack-infra19:29
*** jamesmcarthur_ has quit IRC19:32
corvusloaded19:32
corvusre-enqueueing19:32
corvus#status log restarted all of Zuul at commit 6afa22c9949bbe769de8e54fd27bc0aad14298bc  due to memory leak19:32
openstackstatuscorvus: finished logging19:32
*** Weifan has joined #openstack-infra19:34
*** smarcet has joined #openstack-infra19:37
openstackgerritMonty Taylor proposed opendev/system-config master: Use internal gitweb instead of gitea for now  https://review.opendev.org/65523819:37
*** Weifan has quit IRC19:39
paladoxAh you use logging too :) (though your bot appears to have "status" too).19:41
*** kjackal has quit IRC19:42
fungiand we have it send notices to numerous irc channels, and in extreme cases also update channel topics about ongoing situations19:42
fungiand all the entries get recorded at https://wiki.openstack.org/wiki/Infrastructure_Status (for the moment anyway)19:42
paladoxThat's nice! (that would be useful)19:43
paladoxour's get's logged to multiple places19:43
fungii think ours also tries to tweet things, but i'm not sure where since i'm not really into social media19:43
paladoxheh (i know ours does :))19:43
fungithere was talk of an rss/atom feed as well19:43
paladoxsomeone logged spam.19:43
mordredfungi, paladox: https://twitter.com/openstackinfra19:44
fungiahh, there it is19:45
paladoxThe bot can log here https://wikitech.wikimedia.org/wiki/Nova_Resource:<project>/SAL (if it's a WMCS project) otherwise things get logged here https://wikitech.wikimedia.org/wiki/Server_Admin_Log19:45
paladoxah19:45
fungicorvus: clarkb: should i readd my earlier autoholds, or do we want to just watch the system for a bit first and see if the problem resurfaces?19:47
mordredsmarcet: o hai. I told summit.openstack.org to sync with my google calendar, but I dont have any summit sessions on my calendar. you have all the magical fixing powers right?19:47
fungiworst bug report evar19:48
corvusfungi: good q, and i'm too hungry to come up with an answer19:48
clarkbif we can remove autoholds after the fact without them triggering I say add them19:49
clarkbotherwise maybe watch and see19:49
fungiyeah, about to find food as soon as the infra meeting is over19:49
fungiclarkb: oh, the scheduler restart took care of removing the autoholds for us, which is why i asked ;)19:49
clarkbah19:49
paladoxfungi this is our's https://twitter.com/wikimediatech19:50
openstackgerritSean McGinnis proposed zuul/zuul-jobs master: ensure-twine: Don't install --user if running in venv  https://review.opendev.org/65524119:56
smcginnisfungi, clarkb, corvus: Newbie yet, so would appreciate feedback on that approach. ^19:56
clarkbcorvus: want to direct enqueue 654634?19:57
openstackgerritSean McGinnis proposed zuul/zuul-jobs master: ensure-twine: Don't install --user if running in venv  https://review.opendev.org/65524119:57
* clarkb smells the curry that was made for lunch and wanders downstairs19:58
fungii have shrimp risotto to get to19:58
fungismcginnis: i *suspect* the problem for us is going to be that the virtualenv from which ansible is run is read-only for the jobs, so they're not going to be able to pip install anything into it19:59
mordredsmarcet: nevermind. I don't know how to calendar apparently20:00
smarcetmordred: actually there are 2 ways of doing it20:01
corvusfungi, smcginnis: that's an avenue to explore, however, i'm not sure the ansible virtualenv will be "activated" for anything other than the ansible process...20:01
smarcetmordred: allow oauth2 permission to your calendar using the synching button and choose google as provider20:02
fungismcginnis: probably we either need to have ansible invoke pip install --user under the system python (not exactly sure what the complexities of that are) or have it create a local virtualenv in the workspace and pip install twine into that20:02
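
A sketch of the second option fungi mentions, a throwaway virtualenv inside the job's workspace; the path is purely illustrative and the version pins are copied from the ensure-twine install command quoted elsewhere in this log:

    # build twine into a venv the job owns, then run it from there
    python3 -m venv "$WORK_DIR/.twine-venv"
    "$WORK_DIR/.twine-venv/bin/pip" install 'twine!=1.12.0' 'readme_renderer[md]!=23.0' 'requests-toolbelt!=0.9.0'
    "$WORK_DIR/.twine-venv/bin/twine" --version
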
smcginniscorvus: I believe that will still pick up if we are running via a virtualenv python, even if the whole environment is not activated.20:02
smarcetmordred: or you could use the brand new option “20:03
smarcetGET SHAREABLE CALENDAR LINK20:03
smarcet20:03
corvussmcginnis: ansible is being run in a virtualenv -- what ansible then runs is an open question20:03
smarcetmordred: from page https://www.openstack.org/summit/denver-2019/summit-schedule20:03
smcginnisBased on the failure, it would appear what gets run at least picks up that python executable.20:03
corvussmcginnis: do you have a link that shows that?20:03
smcginnisfungi: If that's true, we should drop the ensure-twine role completely as it likely will never work right.20:03
corvusall i've seen is the opaque error about user in a virtualenv20:04
smcginniscorvus: That was the conjecture^Whypothesis earlier as to why it is failing the pip install. It's not running in a virtualenv itself.20:04
corvusright, i'm saying i have doubts about that hypothesis and we should attempt to prove or disprove it rather than assume it is correct20:05
diablo_rojoclarkb, sorry I got distracted during the meeting, nothing new with storyboard-- still have a lot of patches to review. Planning a huge story triage/overhaul at the PTG Thursday morning. Thats about it.20:05
fungiagreed, it's also possible pip is confused and thinks it's being run from a virtualenv when it isn't20:05
pabelangercorvus: smcginnis: I believe, if ansible connects to localhost with the local connection, the playbook task will appear to be inside the virtualenv; however, if it uses ssh to localhost, it won't.20:06
smcginnisAll I know is, releases are completely blocked until this issue is resolved.20:06
corvusanyway, i think the next step is for someone to write a job which exercises this stuff and gets some debug output20:07
mordred++20:07
*** Weifan has joined #openstack-infra20:07
*** jamesmcarthur has joined #openstack-infra20:08
*** igordc has quit IRC20:10
fungiit's entirely possible, since "twine_python" is a variable we're passing to the role, that its value started being set to 'python3' recently and that's what started triggering this behavior? looking at the json, it's running this command: `python3 -m pip install twine!=1.12.0 readme_renderer[md]!=23.0 requests-toolbelt!=0.9.0 --user`20:11
fungithe default declared for twine_python in the role is just "python" not "python3"20:12
*** Weifan has quit IRC20:12
*** jamesmcarthur has quit IRC20:12
fungithe release-python playbook in opendev/base-jobs doesn't set it20:13
funginor does the opendev-release-python job in the same repo20:14
*** Weifan has joined #openstack-infra20:14
*** igordc has joined #openstack-infra20:14
*** igordc has quit IRC20:15
*** Lucas_Gray has joined #openstack-infra20:16
fungiaha, release-openstack-python as defined in openstack/project-config sets it according to http://zuul.opendev.org/t/openstack/job/release-openstack-python20:16
fungihttps://opendev.org/openstack/project-config/src/branch/master/zuul.d/jobs.yaml#L112-L12920:18
fungiits path to the ensure-twine role, for the record, is via https://opendev.org/openstack/project-config/src/branch/master/playbooks/publish/pypi.yaml20:19
clarkbfwiw the last ansible exitcode 4 was at 19:08UTC on ze0120:22
fungigit blame suggests the twine_python variable was set in that job as of november when https://review.opendev.org/616676 merged20:22
fungiso that's not what has caused this20:23
smcginnisSince that first solum patch failed, but then worked after a reenqueue, is there something changed in the nodes that could explain why one (presumably newer) node would fail, but another would work as it has been until now? And as more nodes were updated the failure became more prevalent?20:24
fungithere are no nodes in this case, the ensure-twine role is running on the executor ("localhost" in the inventory)20:26
smcginnisExecutors updated?20:26
fungiso been trying to figure out what could have changed on the executors on or around the 17th20:26
*** pcaruana has quit IRC20:32
openstackgerritMerged opendev/system-config master: Double stack size on gitea  https://review.opendev.org/65463420:33
clarkbcorvus: ^ finally20:34
*** kgiusti has left #openstack-infra20:34
clarkbI think we are about half an hour from that applying20:34
*** jamesmcarthur has joined #openstack-infra20:39
*** jamesmcarthur has quit IRC20:46
mordredwoot20:50
mordredclarkb, fungi: https://review.opendev.org/#/c/655238/20:55
clarkbhrm seems like we may still have some ssh failures just not as many of them?20:55
clarkbmordred: left a comment20:56
clarkbze01 has three occurences of ansible exit code 4 in the last few minutes20:57
*** jamesmcarthur has joined #openstack-infra20:57
*** Goneri has quit IRC20:59
clarkbfungi: ^ if you haven't set up the autohold yet you may want to20:59
corvusgitea has an "archived" setting for repos20:59
*** andreykurilin has joined #openstack-infra21:00
diablo_rojoclarkb, fungi Monday is marketplace mixer, Tuesday is Trillio Community Party, Thursday is game night, Friday is PTG happy hour. So Monday after the mixer would work for the Lowry21:00
clarkbdiablo_rojo: ya that is what I'm thinking and the current weather forecast should be reasonable for that21:01
*** jamesmcarthur has quit IRC21:01
diablo_rojoI'd be game for that.21:01
openstackgerritMonty Taylor proposed opendev/system-config master: Use internal gitweb instead of gitea for now  https://review.opendev.org/65523821:03
mordredclarkb: ^^ does that look better?21:03
mordredcorvus: neat. that seems like a thing we shoudl make use of when appropriate21:03
fungiclarkb: i've readded the previous autoholds21:03
mordredclarkb: do you know of any planned GoT viewing parties Sunday evening?21:04
clarkbmordred: it does: <% if  scope.lookupvar("gerrit::web_repo_url") -%>21:04
clarkbmordred: which may still fire on the ''21:04
mordredclarkb: what's the right way to set that so that it doesn't? false?21:04
clarkbmordred: yes I think that is the right way to manipulate the ruby21:05
openstackgerritMonty Taylor proposed opendev/system-config master: Use internal gitweb instead of gitea for now  https://review.opendev.org/65523821:05
openstackgerritMerged opendev/storyboard-webclient master: Show tags with stories in project view.  https://review.opendev.org/64223021:05
*** jamesmcarthur has joined #openstack-infra21:06
mordredclarkb: also - I verified that puppet-gerrit installs the gitweb package if gitweb is true21:07
clarkbgitea just updated I think21:08
clarkbopenstack/openstack works now21:08
fungivictory!21:08
*** boden has quit IRC21:12
mordredclarkb: I'm not 100% prepared to agree with you21:13
mordredoh wait - there it is21:13
clarkbit isn't quick. I think tagging every so many commits would help mitigate that21:16
clarkbsince that name-rev lookup is going back hundreds of thousands of commits21:16
clarkband then doing it for each file's most recent commit21:16
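
For context, the lookup clarkb is describing boils down to something like this per file; with no tags near the tip of openstack/openstack the walk back through history is very long (the commit id is a placeholder):

    # describe a commit relative to the nearest reachable tag
    git name-rev --tags --name-only <commit-sha>
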
*** igordc has joined #openstack-infra21:19
*** Lucas_Gray has quit IRC21:20
*** rh-jelabarre has quit IRC21:21
clarkbmordred: for this networking stuff. One thing I notice is that it seems to happen a lot after what looks like a timeout21:22
clarkbwhich would lend credence to mnaser's suggestion it could be a new firewall or traffic shaper21:22
clarkbmordred: can we tell ansible to tell ssh to do ping pongs back and forth every minute or so?21:22
openstackgerritJason Lee proposed opendev/storyboard master: WIP: BlueprintWriter prototype, attempting bugfixes  https://review.opendev.org/65481221:22
paladoxbtw you may want to beware of gerrit 2.15.12, it apparently has some type of problem that is currently causing an outage for us.21:22
*** rfolco has quit IRC21:24
clarkbhrm we already set ServerAliveInterval to 6021:24
clarkbwhich should mean every 60 seconds ping pong21:24
clarkb(if you didn't get data otherwise)21:24
clarkband default ServerAliveCountMax is 1521:26
clarkbwhich means after about 15 minutes we should disconnect21:26
openstackgerritMerged opendev/base-jobs master: Update opendev intermediate registry secret  https://review.opendev.org/65522821:26
clarkber sorry it is 321:27
clarkbI misread the numbers in the manpage21:27
clarkbso 3 minutes21:27
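
Spelled out as client options, the keepalive behaviour being described: the connection is declared dead after ServerAliveInterval * ServerAliveCountMax seconds of unanswered probes, i.e. 60 * 3, roughly 3 minutes (the user and host below are placeholders):

    # give up on a silent connection after ~180s of missed keepalives
    ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=3 zuul@<test-node>
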
jamesmcarthurHi clarkb: Trying to log into the wiki from https://governance.openstack.org/tc/reference/opens.html throws me back to openstack.org21:27
jamesmcarthurRelated to this recent migration or something else?21:27
jrossergit clone error http://logs.openstack.org/74/652574/3/gate/openstack-ansible-deploy-aio_metal-debian-stable/ecb0b7c/job-output.txt.gz#_2019-04-23_21_09_27_76833521:27
clarkbjamesmcarthur: shouldn't be related to the migration. We didn't touch the wiki21:28
*** tosky has joined #openstack-infra21:28
clarkbjrosser: ya we've been trying to figure out persistent connectivity issues between zuul and test nodes all day21:28
clarkbjamesmcarthur: where is the wiki login from there?21:29
jamesmcarthurseems to be something else going on21:29
jamesmcarthurI'm already logged in21:30
jamesmcarthurI'll open a ticket on our end and see if I can figure it out :)21:30
*** smarcet has quit IRC21:35
openstackgerritMerged opendev/storyboard-webclient master: Show all stories created and allows them to filter according to status  https://review.opendev.org/64237021:37
*** whoami-rajat has quit IRC21:40
*** tjgresha_nope has quit IRC21:41
*** tjgresha has joined #openstack-infra21:43
smcginnisjamesmcarthur: That's not the wiki.21:43
smcginnisjamesmcarthur: https://governance.openstack.org/tc/reference/opens.html is sphinx generated content.21:44
smcginnisBut I can confirm clicking on Log In from there throws back to /21:44
smcginnisNot sure what logging in there is supposed to do.21:44
openstackgerritMerged opendev/system-config master: Install socat on zuul executors  https://review.opendev.org/65457721:44
smcginnisjamesmcarthur: You would need to submit a patch for https://opendev.org/openstack/governance/src/branch/master/reference/opens.rst if you are trying to update that page.21:46
mordredsmcginnis, jamesmcarthur: yeah a) not sure what logging in to the four opens page is intended to accomplish - but also, https://www.openstack.org/Security/login/?BackURL=/home/ ... the BackURL is /home/ - which is unlikely to ever be correct :)21:46
mordredthat also was supposed to be b)21:47
smcginnis:)21:47
clarkbI'm going to take a break now since I feel like I'm just spinning wheels with the networking stuff. It seems slightly better and if fungi can catch one maybe we can debug (and then possibly file a bug with $cloud)21:48
mordredclarkb: ++21:48
mordredjamesmcarthur: https://opendev.org/openstack/openstackdocstheme/src/branch/master/openstackdocstheme/theme/openstackdocs/header.html21:49
mordredthat's where the header is coming from21:49
mordredand https://opendev.org/openstack/openstackdocstheme/src/branch/master/openstackdocstheme/theme/openstackdocs/header.html#L11021:49
mordredis where the login link is coming from - with the nicely hard-coded /home/ as the BackURL21:50
mordredjamesmcarthur, smcginnis: since logging in to openstack docs isn't really a thing, maybe we should just remove the login link from openstackdocstheme?21:50
jamesmcarthurah ha21:50
smcginnisI wonder if there is somewhere that it is actually used.21:50
mordredotherwise I think we'd want to replace /home/ there with some javascript or something that sets an appropriate BackURL21:50
jamesmcarthurmordred: that's.. kind of an excellent point21:51
smcginnisIt might need some sort of conditional display.21:51
mordredsmcginnis: I'm guessing the html got lifted from somewhere else21:51
smcginnisCould likely be21:51
mordredbut there are zero times when we'll need a login page on published static docs21:51
smcginnisI do really wish gitea had a Blame button.21:51
jamesmcarthurwe provide a little javascript include with the openstack menu so that everyone that's using it can stay up to date21:51
jamesmcarthurbut it's definitely not applicable to docs21:51
fungiokay, sustenance has been consumed and i have 8 minutes to catch up before my next conference call21:51
jamesmcarthurlol21:52
mordredjamesmcarthur: yeah. that said - if we DID want to fix the login link, just for consistency, that seems fine21:52
openstackgerritMerged opendev/system-config master: Add script to automate GitHub organization transfers  https://review.opendev.org/64493721:52
mordredjamesmcarthur: and that way the javascript include would work and it would look integrated and whatnot :)21:52
mordredbut - you know - I leave all of that to your very capable hands :)21:52
smcginnismordred, jamesmcarthur: Looks like it was intentionally added since it is the first thing mentioned in the commit message: https://github.com/openstack/openstackdocstheme/commit/d31e4ded8941a69b36de413f1bcf56c91bece77921:53
mordredsmcginnis: weird. but also - good to know21:54
*** jcoufal has quit IRC21:55
*** jamesmcarthur has quit IRC21:55
* mordred needs to AFK21:56
fungiif asettle is awake already, maybe she remembers the reasons there? she was the one who approved that addition21:56
fungier, s/approved/committed/21:56
*** jamesmcarthur has joined #openstack-infra21:58
*** jcoufal has joined #openstack-infra21:58
fungiAJaeger was the one to approve it21:58
fungithree years ago yesterday in fact21:59
*** imacdonn has quit IRC22:01
*** ijw has quit IRC22:01
*** imacdonn has joined #openstack-infra22:01
jamesmcarthurYeah... it was done to try to solidify the various implementations of the openstack header.22:02
jamesmcarthurClearly worth a revisit :)22:02
corvussmcginnis: wish granted.  merged 4 days ago, probably will be in 1.9.0:  https://github.com/go-gitea/gitea/pull/572122:08
mriedemi just noticed this in a non-voting job in stable/pike but it's also in queens, looks like legacy jobs are now failing because of incorrect or missing required-projects on devstack-gate?22:12
mriedemhttp://logs.openstack.org/98/640198/2/check/nova-grenade-live-migration/370efe9/job-output.txt.gz#_2019-04-22_13_16_53_23377222:12
mriedemis that a known issue?22:12
mriedemseems to be after the opendev rename22:12
*** slaweq has quit IRC22:12
mriedemhttps://review.opendev.org/#/c/640198/2/.zuul.yaml@3822:13
mriedemnot sure if we need to change that in stable branch job defs now?22:13
mriedemi guess that's what this was for... https://github.com/openstack/nova/commit/fc3890667e4971e3f0f35ac921c2a6c25f72adec22:14
*** slaweq has joined #openstack-infra22:14
*** jamesmcarthur has quit IRC22:15
corvusmriedem: that change was approved despite the fact that the job it added failed when it ran (since the job is non-voting, it's not gating)22:20
mriedemyeah i realize that22:21
mriedemi'm wondering if i need to fix this devstack-gate thing on stable branches, or are there redirects in place?22:21
corvusmriedem: it needs to be fixed22:21
mriedemok22:21
mriedemah i see there were migration patches per branch, so this isn't as bad i thought it'd be22:27
*** amoralej|off has quit IRC22:27
mriedemjust the one job that missed it22:27
*** hwoarang has quit IRC22:32
*** hwoarang has joined #openstack-infra22:34
*** tonyb has joined #openstack-infra22:41
*** wehde has quit IRC22:47
*** tkajinam has joined #openstack-infra22:55
*** kranthikirang has quit IRC22:56
WeifanOur repo has been moved from the openstack namespace to x. And our project is no longer published to pypi automatically after pushing a new tag.23:00
WeifanDoes that mean we should replicate the repo from opendev to github? Or does anyone know how we can set it up so it would be published based on tags on opendev?23:00
WeifanOr is it suggested that we set up our own jobs to publish it?23:01
*** jcoufal has quit IRC23:10
*** aaronsheffield has quit IRC23:11
*** hwoarang has quit IRC23:13
*** tosky has quit IRC23:13
*** yamamoto has joined #openstack-infra23:14
*** hwoarang has joined #openstack-infra23:14
*** diablo_rojo has quit IRC23:14
*** jcoufal has joined #openstack-infra23:16
*** gmann is now known as gmann_afk23:17
*** yamamoto has quit IRC23:18
clarkbpypi and github are independent23:19
clarkbthe tag jobs should push to pypi regardless of github23:19
*** diablo_rojo has joined #openstack-infra23:19
Weifanit has been 1 day, and it is still not updated on pypi23:19
Weifanbut the tag can be found on opendev23:19
clarkbthere is a builds tab on https://zuul.openstack.org; can you search for your pypi jobs there?23:20
clarkbit probably failed for some reason23:20
*** yamamoto has joined #openstack-infra23:20
*** rcernin has joined #openstack-infra23:20
*** hwoarang has quit IRC23:22
*** hwoarang has joined #openstack-infra23:23
Weifanany suggestion on how to find the job? can't seem to find it....it was for https://opendev.org/x/networking-bigswitch23:23
clarkblet me see23:24
clarkbhttps://zuul.openstack.org/build/e8686ce24e04408aaef4f34c99bd7f2723:26
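
A build like that can also be pulled up via the zuul REST API; the query parameters here are an assumption about the zuul-web API of the time, with the project name taken from the discussion:

    # list recent release-openstack-python builds for the project
    curl -s 'https://zuul.openstack.org/api/tenant/openstack/builds?project=x/networking-bigswitch&job_name=release-openstack-python'
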
*** lseki has quit IRC23:26
openstackgerritJason Lee proposed opendev/storyboard master: WIP: BlueprintWriter prototype, additional bugfixes  https://review.opendev.org/65481223:27
clarkbthat may be the twine issue?23:27
clarkbI'm not in a good spot to debug that as I am on a phone23:27
Weifanlooks like the ansible task failed23:28
Weifanseems like all of them are failing right now, not just our project23:32
*** igordc has quit IRC23:32
fungiyes, we're still trying to work out the cause. it started happening roughly a week ago, so well before the opendev migration23:32
*** jcoufal has quit IRC23:33
fungithough it was intermittent until today-ish23:33
Weifanok, thanks for the information :)23:34
fungiWeifan: yep, same error in your log too... "Can not perform a '--user' install. User site-packages are not visible in this virtualenv."23:35
fungiwe'll reenqueue that tag object once we work out the fix, so no need to push a new tag for that23:36
*** gyee has quit IRC23:39
Weifanwould it be related to python3? release-openstack-python seems to have python3 as "release_python", but the job was on queens23:39
Weifanwhich probably uses py223:39
*** rlandy|ruck has quit IRC23:40
*** yamamoto has quit IRC23:41
*** yamamoto has joined #openstack-infra23:42
*** yamamoto has quit IRC23:43
fungii haven't been able to find a correlation by interpreter. the release-openstack-python job has been set to python3 since november23:44
*** hwoarang has quit IRC23:49
*** hwoarang has joined #openstack-infra23:51
*** mattw4 has quit IRC23:59
