Friday, 2019-05-24

openstackgerritIan Wienand proposed zuul/zuul-jobs master: validate-zone-db : add job and make more generic
openstackgerritTristan Cacqueray proposed zuul/zuul master: trigger: add job filter event
*** smarcet has joined #openstack-infra00:20
openstackgerritIan Wienand proposed openstack-infra/ master: Add zone-check job
ianwcorvus / fungi : thanks; updated to a more generic job00:55
openstackgerritTristan Cacqueray proposed zuul/zuul master: trigger: add job filter event
openstackgerritTristan Cacqueray proposed zuul/zuul master: webtrigger: add initial driver and event
fungigerrit task queue is down below 6k now01:29
paladoxgot to love the gerrit task queue :P01:30
openstackgerritTristan Cacqueray proposed zuul/zuul master: webtrigger: add web route and rpclistener
fungiand now below 5k02:10
clarkbspeeding up02:16
funginow below 4k02:30
*** sarob has joined #openstack-infra02:48
openstackgerritTristan Cacqueray proposed zuul/zuul master: web: add build button to trigger job
fungiand *finally* caught up03:14
openstackgerritMerged zuul/zuul master: ansible-config: pin ara to <1.0.0
openstackgerritMerged openstack/reviewstats master: Handle all exceptions loading pickled data
openstackgerritMerged openstack/reviewstats master: Switch default server to
openstackgerritMerged openstack/reviewstats master: Don't fail when writing cache either
openstackgerritTristan Cacqueray proposed zuul/zuul master: amqp: add basic trigger
openstackgerritTristan Cacqueray proposed zuul/zuul master: amqp: add message informations to the job variables
openstackgerritJoshua Hesketh proposed x/gearman-plugin master: add project parameters
openstackgerritJoshua Hesketh proposed opendev/system-config master: Add tab to link from repo page to gerrit changes
openstackgerritJoshua Hesketh proposed opendev/system-config master: Add tab to link from repo page to gerrit changes
openstackgerritTristan Cacqueray proposed zuul/zuul master: git: only list head references
openstackgerritIan Wienand proposed openstack/project-config master: Switch rax.dfw mirror to
openstackgerritIan Wienand proposed opendev/system-config master: Move openSUSE Tumbleweed into a caching mirror instead
evrardjpianw: thanks for these ^06:20
AJaegerconfig-core, want to review moving deploy-guide to tox-docs as well? and are ready for review, please06:23
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters
*** slaweq has joined #openstack-infra06:35
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters
*** ramishra has joined #openstack-infra06:47
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters
openstackgerritTristan Cacqueray proposed zuul/zuul master: git: only list head references
openstackgerritMerged zuul/nodepool master: Use py3 pathlib in DibImageFile
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters
openstackgerritMerged opendev/system-config master: Cleanup bashate errors to make them easier to understand
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters
openstackgerritIan Wienand proposed openstack/project-config master: Switch rax.dfw mirror to
ianwclarkb: ^ note i added the volumes on the mirror as we discussed this morning too, so i consider it fully operational07:23
ianwAJaeger: thanks, i just realised i had the match on the region/provider backwards (wouldn't have broken, just wouldn't have done anything :)07:23
*** gmann has quit IRC07:28
openstackgerritTristan Cacqueray proposed zuul/zuul master: git: only list heads and tags references
openstackgerritMerged opendev/system-config master: Move openSUSE Tumbleweed into a caching mirror instead
openstackgerritMerged openstack/diskimage-builder master: Replace URLs with URLs
*** tkajinam has joined #openstack-infra08:04
openstackgerritTobias Henkel proposed zuul/zuul master: Support squash merge in Github
openstackgerritAndreas Jaeger proposed openstack/project-config master: Use tox for publish-deploy-guide
AJaegerfrickler: thanks for catching this ^08:19
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters
openstackgerritMerged openstack/pbr master: Remove neutron-lbaas
fricklerAJaeger: ah, great, that looks more like what I'd expected, thx08:21
*** ykarel is now known as ykarel|lunch08:21
openstackgerritFelix Schmidt proposed zuul/zuul master: Differentiate between queued and waiting jobs in zuul web UI
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters
openstackgerritHervé Beraud proposed openstack/pbr master: Read description file as utf-8
openstackgerritHervé Beraud proposed openstack/pbr master: Read description file as utf-8
zbrdid anyone ask if it would be possible to enable gitea issue tracker for specific projects?08:53
openstackgerritHervé Beraud proposed openstack/pbr master: Read description file as utf-8
*** gfidente has joined #openstack-infra09:01
openstackgerritdongwenjuan proposed x/gearman-plugin master: add project parameters and bindep file
openstackgerritFelix Schmidt proposed zuul/zuul master: Differentiate between queued and waiting jobs in zuul web UI
*** sarob has joined #openstack-infra10:04
openstackgerritFabien Boucher proposed zuul/zuul master: A reporter for Elasticsearch
*** markvoelker has quit IRC11:57
mordredzbr: it is not currently possible. each gitea behind the load balancer is completely independent. there is work needed upstream for us to be able to have a single clustered install - namely around indexing12:07
*** psachin has quit IRC12:08
mordredzbr: that's not saying that once that's fixed it will be decided to support such a thing - just that right now it's physically impossible12:13
*** Yamini has joined #openstack-infra12:20
Yaminihi team, am facing an issue in the CI system while running the CI 2019-05-24 12:21:22,231 DEBUG zuul.Repo: Cloning from ssh:// to /var/lib/zuul/git/
Yaminidid something change hitting it only from yesterday12:20
Yaminithe above cloning fails12:21
Yamini    repo.reset()   File "/usr/local/lib/python3.6/dist-packages/zuul/merger/", line 263, in reset     self.update()   File "/usr/local/lib/python3.6/dist-packages/zuul/merger/", line 427, in update     repo = self.createRepoObject()   File "/usr/local/lib/python3.6/dist-packages/zuul/merger/", line 256, in createRepoObject     self._ensure_cloned()   File "/usr/local/lib/python3.6/dist-packages/zuu12:21
Yamini2019-05-24 12:21:22,182 ERROR zuul.Merger: Unable to reset repo <zuul.merger.merger.Repo object at 0x7f3d41ba91d0> Traceback (most recent call last):   File "/usr/local/lib/python3.6/dist-packages/zuul/merger/", line 666, in _mergeItem     repo.reset()   File "/usr/local/lib/python3.6/dist-packages/zuul/merger/", line 263, in reset     self.update()   File "/usr/local/lib/python3.6/dist-packages/zuul/merger12:22
Yaminihas anyone seen similar error12:22
*** jpena|lunch is now known as jpena12:38
corvusi'm going to run another gitea gc pass now12:41
corvusYamini: can you paste the entire error and stack trace to and copy the resulting url here?12:47
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure driver -
mordredinfra-root: I thought I was going to be AFK today - but the diving has been cancelled due to bad weather in the gulf - so I'm here today12:56
*** e0ne has joined #openstack-infra13:01
*** hwoarang has quit IRC13:07
fungisorry to hear about your last-minute change of plans, but glad you're here!13:07
corvusmordred: i'm not glad you're not diving but i am glad you're not diving in bad weather13:08
corvuslooks like some of the gcs are wrapping up now13:09
mordredcorvus: yes - I completely agree. it seems like spending three nights on a dive boat in the gulf of mexico in high seas would be ... unpleasant13:10
fungizbr: we do have the flexibility to point different projects to different issue trackers though (and currently use it to hook the issues link up to either launchpad or storyboard)13:13
zbrfungi: i know but there would be a huge benefit in terms of accessibility to have issue tracker integrated, especially for small/standalone projects.13:15
*** sreejithp has joined #openstack-infra13:15
*** sreejithp_ has joined #openstack-infra13:17
*** rfolco|brb is now known as rfolco13:18
*** sreejithp has quit IRC13:19
openstackgerritSorin Sbarnea proposed openstack/pbr master: Allow git-tags to be SemVer compliant
*** dpawlik has quit IRC13:23
corvusgitea gc complete13:39
corvusmordred, fungi: iiuc, in order to pick up the new gitweb links, we need to restart gerrit, which, ironically means another full replication?13:40
corvus(and yes, that's alanis-mode irony)13:42
*** Goneri has joined #openstack-infra13:45
fungiright, will take effect after the next gerrit restart13:51
fungirestarting it later today might make sense as we traditionally see lower volumes in the latter hours utc of friday leading into the weekend13:52
fungiwhich means 1. folks are less likely to be inconvenienced/notice, and 2. it may go faster when there's less contention for ci requests13:53
dmsimardSo, for some reason, "git clone --mirror" commands are failing very often from CentOS but not from Fedora. It seems like this started happening after the switch to opendev and we aren't able to reproduce on github. I'm still troubleshooting but I wanted to mention that in case it rang a bell13:54
*** betherly has joined #openstack-infra13:55
dmsimardgit clone --mirror commands are apparently run when cloning puppet modules for a Puppetfile and I can't reproduce the issue without --mirror13:55
fungicould be related to the vintage of git client on each platform, and possibly how they handle differences in protocol implementation server-side13:55
*** jistr|afk is now known as jistr13:56
dmsimardyeah, that's why I checked fedora :D13:56
openstackgerritMarkus Hosch proposed zuul/zuul master: WIP: Fix case sensitivity in codeowners check
*** quiquell has joined #openstack-infra13:58
*** bnemec is now known as beekneemech14:14
*** stephenfin is now known as finucannot14:14
openstackgerritTobias Henkel proposed zuul/zuul master: WIP: Fix case sensitivity in codeowners check
openstackgerritTobias Henkel proposed zuul/zuul master: Evaluate CODEOWNERS settings during canMerge check
*** _erlon_ has joined #openstack-infra14:22
openstackgerritMatthew Thode proposed openstack/diskimage-builder master: allow the use of non-bzip compressed stages for building gentoo
*** smarcet has joined #openstack-infra14:34
*** liuyulong has joined #openstack-infra14:37
AJaegerconfig-core, want to review moving deploy-guide to tox-docs as well? and are ready for review, please14:40
*** ykarel|meeting is now known as ykarel14:48
openstackgerritFabien Boucher proposed zuul/zuul master: Disable gc in test_scheduler.TestExecutor as done in base assertFinalState
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure driver -
openstackgerritJean-Philippe Evrard proposed opendev/system-config master: Start mirroring openSUSE Leap 15.1
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: WIP: registry test job
*** piotrowskim has quit IRC15:06
*** kopecmartin is now known as kopecmartin|off15:06
*** e0ne has joined #openstack-infra15:11
openstackgerritDavid Moreau Simard proposed opendev/puppet-zuul master: Pin ARA to <1.0.0
openstackgerritDavid Moreau Simard proposed zuul/zuul master: Pin ARA to <1.0.0
fungijust got around to confirming, but the "openid" tab is now gone from and the old url it used to take you to now renders a blank page15:17
fungiso seems that hole is effectively plugged as far as i can see15:17
*** jamesmcarthur has joined #openstack-infra15:19
*** Lucas_Gray has joined #openstack-infra15:22
openstackgerritJason E. Rist proposed openstack/project-config master: Retiring TripleO-UI
*** pgaxatte has quit IRC15:25
*** owalsh has quit IRC15:25
mordredfungi: \o/15:29
clarkbcatching up, disk usage looks good after this morning's gc15:34
clarkbI've rechecked the cron installation change as it failed on the puppet idempotency check15:35
clarkbmaybe we should stop caring about idempotency as it races repos like project-config merges15:36
AJaegerinfra-root, should we use ansible 2.7 for testing in jobs ? See and , please15:37
corvusAJaeger: if this is about ansible-lint, didn't we decide that 2.8 was too aggressive?  should we drop it altogether?15:40
corvus(if we aren't going to be able to use 2.8, i don't see the point in continuing to use 2.7)15:40
AJaegercorvus: it's only about *ansible*, no changes for lint.15:40
openstackgerritClark Boylan proposed opendev/puppet-openstackci master: Remove idempotency spec checks
corvusAJaeger: what is ansible's purpose here?15:41
clarkbthat change should remove the idempotency checks from the openstackci beaker/spec jobs15:41
*** roman_g has quit IRC15:41
corvus(maybe the comment about why we're installing ansible needs updating too)15:42
corvusi'm looking into the cronspam errors from the new mirror server15:42
* AJaeger can abandon as well if it's not needed...15:43
corvusAJaeger: i don't know the right course of action here, i barely understand the linting jobs15:43
AJaegercorvus, let me try without ansible, you have a good point there...15:43
*** roman_g has joined #openstack-infra15:43
corvusAJaeger: that should at least answer the question15:44
corvusregarding the mirror -- there seems to be an error in the playbook15:44
AJaegercorvus: since zuul does not require ansible anymore, we could remove the line..15:44
corvusfailed: [] (item=wheel) => {"changed": false, "gid": 1000, "group": "ubuntu", "item": "wheel", "mode": "0755", "msg": "chown failed: [Errno 30] Read-only file system: b'/afs/'", "owner": "daemon", "path": "/afs/", "size": 2048, "state": "directory", "uid": 1}15:44
corvusthe issue is that the ownership of the directory in afs is daemon/wheel whereas the playbook seems to want the link target owner to be root/root15:45
corvusi don't know why the playbook cares about that, but set that aside for the moment15:45
corvusthe thing i don't understand is why this did not fail in testing15:45
*** rpittau is now known as rpittau|afk15:45
corvushere you can see that it set up the link without complaint:
corvuswhat if it works the first time but not later times?15:45
mordredoh. piddle15:46
*** jamesmcarthur has quit IRC15:46
clarkbfitting that I would propose a change to remove idempotency checks and we discover somewhere else that having that would be nice :)15:46
mordredcorvus: do we actually have the AFS client going in those tests?15:47
corvusclarkb: yeah, i was wondering if anyone would notice that :)15:47
corvusmordred: yes15:47
mordredkk. so it's not that15:47
corvusthere's even an explicit verification that afs is mounted and working15:47
*** kjackal has quit IRC15:48
corvushrm, i can't back up my not-idempotent theory looking at the run logs15:49
*** zul has quit IRC15:49
*** ijw has joined #openstack-infra15:53
*** jamesmcarthur has joined #openstack-infra15:55
openstackgerritJames E. Blair proposed opendev/system-config master: WIP: Fix new mirror system errors
*** kjackal has joined #openstack-infra15:56
corvusthat collects two minor issues i've observed so far, and attempts to debug the perm issue15:56
*** ykarel is now known as ykarel|away15:56
*** jamesmcarthur has joined #openstack-infra16:03
*** e0ne has quit IRC16:04
AJaegerremoving ansible from tox.ini, ansible-lint drags in ansible 2.8 and we get test failures, so I won't add additional changes16:10
clarkbwe are setup to do 4 replication threads per gitea target. That explains why we use ~4 cpus on the gitea side I guess16:12
clarkbso we could use all 8vcpus if we bumped the number of replication threads16:12
corvusoh hey, looks like the test confirms the idempotency hypothesis16:17
clarkbansible bug?16:17
corvusmaybe... this should be easy to test/reproduce16:18
*** lucasagomes has quit IRC16:19
corvusyep, reproduces locally just by creating a link to a target owned by root16:20
*** bobh has joined #openstack-infra16:23
corvusi think we just want to set "follow" to true16:23
corvuser false16:23
*** bobh has quit IRC16:23
clarkbwe want to set the perms on the link itself not the target so that makes sense16:23
corvusyep, that seems to work.  so yeah, i think there's still a minor bug in that in order to be idempotent, it really should check the perms after creating the link if follow is set to true (which is the default post 2.5)16:24
corvusi'll file that and then fixup my patch16:25
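The "follow" distinction discussed above can be sketched outside Ansible. This is a minimal Python illustration, not Ansible's implementation: the helper names `describe` and `set_owner` are hypothetical, and the only point is that an ownership change on a symlink either targets the link itself (follow false) or the link's target (follow true). In the case above the target is on a read-only filesystem, so the following variant fails.

```python
import os
import stat
import tempfile


def describe(path, follow):
    """Return (kind, uid) for path, optionally following a symlink."""
    st = os.stat(path, follow_symlinks=follow)
    if stat.S_ISLNK(st.st_mode):
        kind = "link"
    elif stat.S_ISDIR(st.st_mode):
        kind = "dir"
    else:
        kind = "file"
    return kind, st.st_uid


def set_owner(path, uid, gid, follow):
    """Change ownership of the link itself (follow=False) or its target."""
    os.chown(path, uid, gid, follow_symlinks=follow)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        link = os.path.join(tmp, "link")
        os.mkdir(target)
        os.symlink(target, link)
        # follow=False inspects the link; follow=True inspects the target.
        print(describe(link, follow=False)[0])
        print(describe(link, follow=True)[0])
        # Touch only the link, never the (possibly read-only) target.
        set_owner(link, os.getuid(), os.getgid(), follow=False)
```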
beekneemechHaving to do extensive work on GitHub recently has made me appreciate our infra even more than I already did.16:26
beekneemech#thanks infra team for providing such a great place to do development16:26
openstackstatusbeekneemech: Added your thanks to Thanks page (
clarkbbeekneemech: awww thanks16:27
clarkbThis reminds me16:28
clarkb#success Infra puppetry now running puppet-4 everywhere but arm64 hosts and disabled (for cfgmgmt) hosts16:29
openstackstatusclarkb: Added success to Success page (
mtreinishbeekneemech: I definitely can second that sentiment. I miss infra all the time working mostly on github projects these days16:29
*** ccamacho has quit IRC16:30
*** jamesmcarthur has quit IRC16:34
clarkbinfra-root I see the certcheck emails16:36
clarkbUsually I try to get those down when they get down to about a week left16:37
*** hjensas has quit IRC16:37
clarkbthat way we maximize cert usage on the old cert but give us time to get things replaced before expiration16:37
clarkbAs with survey and I expect these will all get 2 year certs16:37
corvusclarkb, mordred:
openstackgerritJames E. Blair proposed opendev/system-config master: Fix new mirror system errors
corvusinfra-root: ^ that should fix the mirror cronspam16:43
clarkbcorvus: bug seems to capture what I expected was the issue16:43
clarkb+2'd the fix on our end16:43
corvusyeah, arguably there's another bug which is "follow should be false by default if state is link"16:44
fungisorry, popped out to the pub for a nice friday lunch. lots of highlights so catching back up now16:44
corvusbut i seem to have misplaced my can-of-worms-opener, so that one will have to wait16:45
clarkbcorvus: I think I left mine on the boat wednesday :)16:45
*** kjackal has quit IRC16:45
corvusclarkb: you should go back for it!  let me know if you need help....  :)16:45
fungibest part of scrollback was beekneemech's praise. the rest i no longer care about! ;)16:47
fungiclarkb: i'd be cool with redoing survey to be and not renewing the cert for it since it's still more or less in beta16:48
clarkbfungi: too late it expired right after the ptg and I took care of it16:48
fungioh, yeah that's right. i've repressed all memories of that week16:49
*** diablo_rojo has quit IRC16:51
toskyclarkb: the logs-dev change seems to work; I added few comments to the review and spammed a lot of people (
toskyso now it's more an architectural question: should we assume that the logs viewer shows those files and remove the custom handling, or leave it in place because "you can never know"?16:56
clarkbtosky: I think we can probably enable the dotfile viewing on the production vhost16:57
toskycan we assume the same for any other zuul instance? I mean, are those settings part of the standard zuul configuration, or are those custom settings for
clarkbtosky: zuul doesn't enforce log hosting so that is per deployment16:58
toskyclarkb: should the code be kept (and improved) in zuul-jobs then, as zuul-jobs is for general consumption, or does it make sense to at least document this?16:59
clarkbtosky: it is an apache specific behavior. We could add a note somewhere that if hosting logs with apache dotfiles may need special config to be indexed17:00
openstackgerritJason E. Rist proposed openstack/project-config master: Retiring TripleO-UI
*** ricolin has quit IRC17:03
clarkbthe change to add the gc cron failed on the nodepool idempotency check due to project-config updates I think17:07
*** quiquell has quit IRC17:07
clarkbif it fails again I think I'll have to dig in more17:07
*** adriancz has quit IRC17:08
corvusi'm +2 on :)17:09
*** mattw4 has joined #openstack-infra17:09
fungiand i'm +317:10
fungistill catching up on high-priority changes which came in while i was at the pub17:10
Shrewsfungi: you were in a pub... on a friday... at the beach. Why did you leave again????17:11
fungiwho says i did? ;)17:12
Shrewsprobably best to leave before all of the out-of-towners arrive today, anyway17:12
*** quiquell has quit IRC17:12
fungiyeah, this was more of a last hurrah before we go into hiding17:12
fungii'll stick my head out of my hole after labor day and if i see my shadow then that means six more weeks of tourists17:13
clarkbis that this weekend?17:13
fungithis weekend is memorial day17:13
fungi(in the usa)17:13
fungithe height of tourism here generally runs from usa memorial day weekend through usa labor day weekend17:14
*** ykarel|away has joined #openstack-infra17:14
clarkbI was debating a trip to the coast or similar this weekend but may have to skip that and do things around the house instead17:15
fungias beaches go, we at least still have somewhat of a clear off-season. i envy locales like bar harbor though17:15
*** mattw4 has quit IRC17:15
openstackgerritClark Boylan proposed opendev/puppet-openstackci master: Apply autoindexing of dotfiles to prod vhost
clarkbtosky: ^ that should get it on the prod vhost17:18
*** Lucas_Gray has quit IRC17:24
openstackgerritMerged opendev/puppet-openstackci master: Remove idempotency spec checks
clarkbI am able to reproduce RDO's git fetch origin --prune problem17:43
clarkbhowever, interestingly it seems to delete refs then add them back again?17:45
clarkbmakes me wonder if a specific backend is unhappy?17:45
clarkbya it totally deleted a ref then add it back when I ran the command again17:45
* clarkb tries against specific backends17:45
* corvus just verified that 06 is not in the lb config17:46
*** hjensas has joined #openstack-infra17:49
fungii also replied to their e-mail requesting some more refined scoping on when they saw the problem arise17:49
clarkbI seem to have no trouble when talking to a specific backend too. (though I didn't clone from specific backend then prune so I'm gonna do that next)17:53
clarkbis it possible that the load balancer is causing the problems?17:53
*** pcaruana has quit IRC17:53
fungicould it be that differences in packing are causing serialized requests split between different backends to return nonsensical results to the client? and newer git is either better at dealing with that or makes its requests in such a way that haproxy is sending them all to the same place?17:56
corvusmaybe we should switch from 'leastconn' to 'source'17:57
clarkbcorvus: ya that is what I'm suspecting. But also I notice that there are fewer objects on gitea0717:58
corvusthough that may end up funnelling all $bigcorp requests to one server, thanks to pnat17:58
clarkbgitea07 had the extended outage from the git config lock file17:58
clarkbpossible we didn't actually replicate everything against that one?17:58
corvusclarkb: seems likely...17:58
corvuswe can trigger a replication to just that host17:58
corvusor we could go ahead and do the gerrit restart17:58
clarkbmy hunch is that each gitea server is internally consistent but if requests bounce between them 07 having fewer refs than the others can lead to trouble17:58
corvusclarkb: we can drop 07 from the lb config temporarily to verify the hypothesis17:59
clarkband yup I did mirror clones and prunes confined to individual gitea servers and everything was fine17:59
clarkbcorvus: That is a simple thing to do. Did you want to do it or should I?18:00
corvusi will18:01
clarkbI've been using openstack/puppet-keystone fwiw18:01
corvuswe lost our haproxy docs18:02
corvusi'll re-add them in a few mins18:02
clarkbcorvus: check my history or roots history for socat commands18:02
clarkbI did things for 06 during ptg18:02
fungilost? did we inadvertently "clean them up?"18:02
corvusi checked out an old commit18:02
fungiahh, i guess they were co-mingled with the cgit docs18:02
*** Weifan has joined #openstack-infra18:03
corvusclarkb: done; want to try now?18:04
clarkbcorvus: it is still trying to delete refs18:05
clarkb(and hanging)18:05
*** eernst has joined #openstack-infra18:05
corvusi should clarify i only removed gitea07 from the https pool18:05
corvusit's still in the http pool, but that shouldn't matter18:05
clarkbI cloned from https so wouldn't have hit the redirect even18:06
clarkbmaybe the next thing to try is force replication against just puppet-keystone18:06
clarkbso ya removing 07 doesn't seem to have fixed it18:06
corvusclarkb: i'll add it back in, and let you force replication if you want to try that next18:07
corvusit's back18:07
openstackgerritMerged opendev/system-config master: Fix new mirror system errors
clarkbreplication start openstack/puppet-keystone is the ssh command to trigger against a specific project18:08
clarkbok that seemed to help. I got through two cycles of clone --mirror and fetch --prune but third appears to have gotten stuck18:11
clarkbI notice that each time I cloned there are different object counts18:11
clarkb(could be due to different gc'ing?)18:11
fungii would suggest we don't --all and instead let the gerrit restart we're wanting to do today take care of that18:11
clarkbwe might compare the packed refs files between the gitea nodes?18:12
clarkbfourth attempt also hanging18:12
clarkbthe packed-refs files do differ according to md5sum18:14
*** eernst has quit IRC18:18
fungiyeah, possible that if part of the request ends up at a different backend then the client could get confused18:19
clarkbya so I've confirmed that 07 has fewer packed refs than 01, but 07 has the refs they just happen to be in refs/changes and aren't packed18:19
clarkband if the client is fetching from one location then another it may get confused?18:20
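The observation above is that two backends can hold the same refs while their packed-refs files differ, because one backend still has some refs as loose files. A minimal sketch of comparing the logical ref sets rather than checksumming packed-refs follows. It assumes the classic files-based ref storage, the function name `read_refs` is hypothetical, and symbolic refs are not resolved.

```python
import os


def read_refs(git_dir):
    """Collect {refname: sha} from packed-refs plus loose ref files."""
    refs = {}
    packed = os.path.join(git_dir, "packed-refs")
    if os.path.exists(packed):
        with open(packed) as fh:
            for line in fh:
                line = line.strip()
                # Skip the header comment and peeled-tag ("^sha") lines.
                if not line or line[0] in "#^":
                    continue
                sha, name = line.split(" ", 1)
                refs[name] = sha
    # Loose refs take precedence over packed entries of the same name.
    for dirpath, _dirs, files in os.walk(os.path.join(git_dir, "refs")):
        for fname in files:
            full = os.path.join(dirpath, fname)
            name = os.path.relpath(full, git_dir).replace(os.sep, "/")
            with open(full) as fh:
                refs[name] = fh.read().strip()
    return refs


def refs_differ(git_dir_a, git_dir_b):
    """Return ref names whose presence or value differs between two repos."""
    a, b = read_refs(git_dir_a), read_refs(git_dir_b)
    return sorted(n for n in set(a) | set(b) if a.get(n) != b.get(n))
```

Under these assumptions an empty result from `refs_differ` means the two copies advertise the same refs even when their on-disk layout, and therefore an md5sum of packed-refs, is different.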
openstackgerritJames E. Blair proposed opendev/system-config master: Add gitea docs
clarkbwe likely do need to set the balance type to source18:20
corvusaccording to the docs, we used to18:20
clarkbeither that or someone get the cluster into sync as far as repo format goes18:20
fungimaybe newer git pipelines subsequent requests?18:20
corvusi wonder why we changed it18:20
fungibut yeah, sounds like we should just go back to source hash18:21
corvusour gc crons are always going to race new refs :/18:22
clarkbI expect that none of this would be an issue if they didn't both mirror and prune18:24
clarkbone or the other is likely fine18:24
corvus is relevant18:24
clarkbthat said mirroring and pruning should be valid18:24
fungiyeah, even if we pointed all the servers at a single backend filesystem (via afs or something) the cron could still cause this by packing refs between subrequests18:24
corvusfungi: though that is far less likely18:25
clarkbcorvus: fungi it is also possible that gitea runs the server side with more state between requests?18:26
fungii agree, it's a less likely race, but it is still a potential race18:26
corvus is the previous change18:26
clarkbcgit may have cared less about this18:26
*** e0ne has quit IRC18:26
clarkbor rather git web or whatever the cgi thing was called18:26
fungigit smart http backend18:27
corvus(depends on whether they had /cgit/ in their clone urls)18:27
corvus is the previous to that18:27
fungioh, right, cgit also had its own18:28
clarkband ya actually strace shows it hangs on a read18:29
*** armax has joined #openstack-infra18:29
clarkbso it could be that gitea01 gets a request and starts handling it then gitea02 gets the next request and doesn't have the state to properly handle it?18:29
clarkband so connection hangs18:29
clarkbthough I suppose that state could be the delta in repo structure18:29
clarkbwe do have a lot of headroom on the cpus for the gitea servers18:30
clarkbwe can probably try source and monitor it18:30
corvus is the previous change18:31
*** e0ne has joined #openstack-infra18:31
corvuswhat i'm getting out of this is that we have alternated between being more concerned about data integrity vs load. :)18:31
corvusthose changes list all the potential pitfalls -- including not only uneven load because of NAT, but also due to clones of nova piling up18:32
corvus^ pitfalls of source18:33
corvusand of course, we previously identified the pitfalls of leastconn -- deltas between hosts18:33
fungiour alternating between concerns of data integrity and load distribution is not a surprising observation18:33
fungibut it is good to weigh in the current determination18:33
corvusi would really love a source-leastconn18:33
fungiits absence suggests there's probably a complexity in actually implementing one which we're not aware of18:34
corvusa lot of state18:35
corvusand cpu18:35
fungithe stereotypical villains in any high-throughput proxy tale18:35
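The trade-off between the two balance modes discussed above can be sketched in a few lines. This is a toy model, not haproxy's algorithm: the backend names, the SHA-256 hash, and both function names are assumptions made for illustration. Source-style selection pins a client address to one backend, so every request in a multi-request git operation sees the same repository state, but many clients behind one NAT address all land together. Leastconn-style selection spreads load but can send consecutive requests from one client to different backends.

```python
import hashlib

# Hypothetical backend pool for illustration only.
BACKENDS = ["gitea01", "gitea02", "gitea03", "gitea04"]


def pick_source(client_ip, backends):
    """Pick a backend deterministically from the client address."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def pick_leastconn(active, backends):
    """Pick the backend with the fewest active connections."""
    return min(backends, key=lambda b: active.get(b, 0))


if __name__ == "__main__":
    # The same client always maps to the same backend.
    print(pick_source("203.0.113.7", BACKENDS))
    print(pick_source("203.0.113.7", BACKENDS))
    # Leastconn follows current load, so the choice shifts between requests.
    active = {"gitea01": 2}
    print(pick_leastconn(active, BACKENDS))
    active["gitea02"] = 3
    print(pick_leastconn(active, BACKENDS))
```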
corvusif we terminated tls at the lb, we might be able to use gitea session cookies18:36
* mordred is back from his sandwich and notices fun with load balancers and git18:36
openstackgerritDavid Shrewsbury proposed zuul/zuul master: WIP: Store hold requests in zookeeper
fungiis git smart enough to actually return cookies?18:37
clarkbI need to pop out now for late breakfast/early lunch. Will be back in a bit. I'm happy to try source as an easy change that will give us more data18:37
corvusno idea!18:37
corvusclarkb: if we want to do that, we should really fix up the grafana dashboard --
corvusi mean, we need to do that anyway18:38
corvusbut i'd really like to have all our metrics before we do that18:38
corvusthe stats socket is available on the host18:38
corvusit should just be a matter of getting the haproxy-statsd script running and pointed at the right location18:38
fungiif it helps, the daily gc cron firing across all backends on a roughly synchronous schedule coupled with fairly synchronous replication to them will probably make this less painful18:39
clarkbfungi: does gc always pack objects?18:39
fungidunno, i really am not sure18:40
fungiwhat factors could cause it to only sometimes pack objects?18:40
corvuscould be that's why we didn't notice it before -- everything was more in sync18:40
mordredyah. and occasional hiccups could just be explained away as "sometimes the internet messes up"18:41
clarkbfungi: disk usage heuristics maybe?18:42
corvusi also need to do lunch; unless anyone is interested in jumping on it now, i'll work on getting haproxy-statsd up after lunch18:43
corvusi guess that should be a container image now18:44
*** pcaruana has joined #openstack-infra18:48
fungifirefox also continues to give me corruption/protocol errors on urls after a gitea restart, forcing me to force reload the first new tab i open. weird browser-side caching side effects i guess18:48
*** xek has joined #openstack-infra18:48
*** pcaruana has quit IRC18:48
openstackgerritMerged opendev/system-config master: Add cron to gc on gitea servers
*** xek_ has joined #openstack-infra18:53
*** xek has quit IRC18:54
fungi"Corrupted Content Error: The site at has experienced a network protocol violation that cannot be repaired. The page you are trying to view cannot be shown because an error in the data transmission was detected. Please contact the website owners to inform them of this problem."19:14
*** xek_ has quit IRC19:14
fungiand the "try again" button it provides just repeats that error. but a force refresh in the tab brings up the expected content19:15
*** tosky has quit IRC19:16
funginext time i see it, i'll try going to about:debugging#workers and unregistering
*** whoami-rajat has joined #openstack-infra19:18
fungisee if that's involved at all19:18
fungithough according to the warning on that panel, i may also have to disable "multiple content processes" in the browser19:20
corvusmultiple tabs open?19:20
openstackgerritJames E. Blair proposed opendev/system-config master: WIP: Add haproxy-statsd to haproxy server
corvusclarkb, fungi, mordred: ^19:34
*** ykarel|away has quit IRC19:35
*** eernst has quit IRC19:35
corvusi believe that's ready to go -- i only set WIP because i cut the jobs list to get a quicker first test run19:36
fungimordred: status19:40
fungiianw was working on static, last i recall19:41
mordredfungi: awesome.19:41
mordredthere are some things we run on status that it seems like maybe we should stop running ...19:42
mordredfor instance - looking at apache access logs, it seems like reviewday is there solely for the bots19:47
fungithe nova team has been discussing reviewday in their weekly meetings at least, and i unstuck it last week at their request19:49
mordredoh wow19:49
mordredok. I guess there is someone still using it then19:49
mordredit has 50 non-bot hits in the last 4 days, fwiw (by my best stab at a quick filter of that)19:50
mordredfungi: what about bugday?
corvusdoes reviewday have a maintainer to opendevify it?19:50
fungilooks like it's basically dprince, judging from
fungiand last commit of substance was a year ago19:52
fungimordred: no clue about bugday19:52
corvusthough, interestingly, that commit made it more opendev friendly :)19:52
fungiyup, i thought the same19:52
fungimordred: i haven't heard anyone ask about bugday in a while, so it may not be in use at all19:53
corvusmaybe reviewday is ready, or only needs minor tweaks (like stylesheets)19:53
mordredmostly trying to remove cruft before updating, rather than the other way around19:53
funginova folks also may only have a passing interest in reviewday, i'm not sure19:54
fungithe other thing it's been providing is custom review dashboards, but those could probably get relocated19:54
fungis/dashboards/one dashboard/19:56
fungithe neutron one19:56
openstackgerritAndreas Jaeger proposed openstack/os-testr master: Fix warning message with double "to"
clarkbI think ttx was involved in reviewday once upon a time19:56
openstackgerritMonty Taylor proposed opendev/system-config master: Remove bugday from status.o.o
mordredthere ^^ sake of argument19:57
fungimakes for an excellent strawman at least19:57
WeifanSorry to interrupt..Does anyone know if there is a way to let zuul run py35 or later for openstack-tox-pep8?19:58
WeifanCurrently if I configure tox_envlist to py35 or py36 in .zuul.yaml, it would throw an InterpreterNotFound Error.19:58
WeifanThe issue is that, by default it used python3.4 for pep8, and it would fail when installing requirements.txt and test-requirements.txt, using upper-constraints in requirements, as requirements repo no longer care about python3.4.19:58
WeifanOr is it suggested we use py27 for openstack-tox-pep8?19:58
*** rfarr__ has joined #openstack-infra19:58
fungiWeifan: switch to a newer node type19:59
fungiubuntu-trusty came with python 3.4, ubuntu-xenial with 3.5, ubuntu-bionic has 3.6 and 3.7 available19:59
clarkbcorvus: the haproxy statsd container change lgtm19:59
*** rfarr_ has joined #openstack-infra20:00
Weifanfungi: thanks, i'll try that20:00
clarkbour default nodeset node is bionic now isn't it?20:01
fungiyes, but many jobs set an explicit nodeset20:02
fungiWeifan: for context, openstack kilo through newton used ubuntu-trusty for testing, ocata through rocky used ubuntu-xenial, and stein through now use ubuntu-bionic20:02
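The node-label-to-interpreter mapping described above can be sketched as a small lookup. This is illustrative only: the table and function names are hypothetical, and the versions listed are just the ones mentioned in the discussion, not a complete inventory of what each image ships.

```python
# Hypothetical lookup table, built only from the versions named in the chat.
NODE_PYTHON3 = {
    "ubuntu-trusty": ["3.4"],
    "ubuntu-xenial": ["3.5"],
    "ubuntu-bionic": ["3.6", "3.7"],
}


def has_interpreter(label, python_version):
    """Return True if the node label is listed as providing that python3 version."""
    return python_version in NODE_PYTHON3.get(label, [])


def nodes_providing(python_version):
    """Return the node labels listed as providing the given python3 version."""
    return [
        label
        for label, versions in NODE_PYTHON3.items()
        if python_version in versions
    ]
```

Under these assumptions, a py35 or py36 tox environment scheduled on an ubuntu-trusty node has no matching interpreter, which is consistent with the InterpreterNotFound error reported above.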
*** rfarr__ has quit IRC20:03
Weifanthat's a bit weird, as the issue is on master20:03
fungiWeifan: have an example change where you're seeing it?20:03
fungizuul job artifacts include all the inheritance information necessary to determine where it's being set20:04
Weifani did some hack to workaround it, can't really find the build report.. but heres what the .zuul.yaml looks like:20:07
Weifan- project:20:07
Weifan    templates:20:07
Weifan      - openstack-python-jobs-neutron20:07
Weifan      - openstack-python35-jobs-neutron20:07
Weifan      - openstack-python36-jobs-neutron20:07
Weifan      - openstack-python37-jobs-neutron20:07
Weifanproject is x/networking-bigswitch20:08
Weifanthis would fail to install requirements due to python version issue20:09
Weifanonly for pep820:09
clarkband you don't have logs?20:09
clarkb(it is really hard for us to debug things without a link to examples)20:09
fungieither a link to a change the failure was reported on, or a link to the logs from that20:10
Weifani made a different commit so couldn't find the old one20:10
openstackgerritAndreas Jaeger proposed openstack/os-testr master: Fix warning message with double "to"
Weifannot sure if you can find build history..20:10
clarkbthat may explain it?20:10
Weifanthats the only one20:11
Weifansome of the commits i was trying20:11
Weifanhad that removed20:11
fungilabel: ubuntu-trusty20:11
openstackgerritAndreas Jaeger proposed openstack/os-testr master: Fix warning message with double "to"
fungithe inheritance path on that is somewhat lengthy, so will take a moment to dig through20:12
clarkb it is going through that20:13
clarkbso my suspicion is probably correct20:13
clarkb remove that line20:13
fungii agree20:14
Weifanso i think if i remove the openstack-python-jobs-neutron, and just add py27 and pep8 jobs, it should be fixed?20:14
fungiremove openstack-python-jobs-trusty20:15
clarkbWeifan: those jobs are already added.
fungiyou already include openstack-tox-pep8 and openstack-tox-py27 explicitly anyway, and those will fall through to using ubuntu-bionic20:15
Weifanthese are the failing reasons of old failures... but for the PRs it was a bit different.. i already removed it in all my PRs20:17
Weifanbut it was still failing due to py3420:17
Weifanthat is why i pasted the .zuul.yaml that i remembered20:18
fungiyes, py34 was available on ubuntu-trusty but on ubuntu-xenial it's py35 and ubuntu-bionic is where py36 and py37 jobs will be run20:18
Weifanissue was that when i looked at the log, it said python is 3.420:19
Weifani think i tried 3.5 and 3.620:19
Weifananyway, i'll try the new nodeset20:19
Weifanthanks for the help20:19
*** slaweq has quit IRC20:19
fungithen that was probably an ubuntu-trusty node, if you omit the nodeset override for those job variants then 3.6 should work20:20
mordredinfra-root: ooh - we have a too-many-open-files notice on codesearch20:20
fungibecause it will run on an ubuntu-bionic node where python3.6 is the default python320:20
openstackgerritAndreas Jaeger proposed openstack/project-config master: Remove python-jobs-trusty
AJaegerWeifan: you need this as well ^20:21
*** Lucas_Gray has joined #openstack-infra20:21
openstackgerritAndreas Jaeger proposed openstack/project-config master: Remove in-tree job from networking-bigswitch
AJaegerWeifan: and this is also important ^20:23
clarkb~9 minutes to haproxy statsd container test results20:24
corvusthe sheer amount of stuff this is testing with no additional test-specific code is amazing :)20:27
clarkbI'm so happy the registry stuff is stable now (or appears to be and now I am knocking on my desk)20:27
WeifanAjaeger: thanks:)20:29
mordredcorvus: ++20:29
clarkbspeaking of stable registry stuff, any idea if placement has looked into that more mordred ?20:30
AJaegerWeifan: I commented on your review, you can remove even more. First the project-config changes need to merge before your change in networking-bigswitch can pass20:30
mordredclarkb: nope - no idea20:31
corvusit's green20:32
corvusi guess i didn't need to drop the jobs20:33
*** EvilienM is now known as EmilienM20:33
AJaegerWeifan: also, backport to your stable branches...20:33
corvusoh, there is an error though:
corvus(not executable)20:34
WeifanAjaeger: I was thinking about making another commit to remove the pep8 thing once those changes are merged20:34
clarkbcorvus: yay for testing20:34
clarkbcorvus: as a heads up I pointed airship via to the zuul-jobs docker stuff20:34
corvusi'm not sure how to verify that it started up20:35
WeifanAjaeger: for stable branches, they are a bit different from master, i'll fix them with a different pr..20:35
clarkbcorvus: ps?20:35
clarkbI guess it is a one shot thing so ps won't help much20:35
fungior may be able to check references on the executable20:35
clarkbcorvus: could tcpdump for port 8125 traffic20:35
fungioh, right, not persistent process20:35
AJaegerWeifan: same idea ;)20:36
fungiso same problem as ps20:36
corvuswell, it is persistent20:36
corvusbut right now it's in a crash loop20:36
corvusso a ps or docker ps might race20:36
clarkboh it does run in a while loop.20:36
fungialso a great point20:36
clarkbcorvus: if you fix the crash wouldn't it be ps'able?20:36
fungicould result in false successes20:36
AJaegerWeifan: your change will not pass this run since project-config is still active, so wait until that is merged and then update ;)20:36
clarkbfungi: oh I see20:37
fungiclarkb: it could be ps'able now if you get lucky20:37
corvusright that20:37
clarkbcould check that docker only started one20:37
clarkbvia docker ps -a ?20:37
WeifanAjaeger: Sure :)20:37
corvusoh hrm.  ok i'll look into that20:37
fungiis log inspection an option?20:39
corvusclarkb: should i add to your airship change too?20:39
corvusfungi: yes20:39
clarkbcorvus: oh probably20:39
fungior, lots of extra work but, stand up a statsd service, configure to send to it, and then check that stats were received and recorded?20:40
clarkbfungi: ya the tcpdump idea was cheap version of ^20:41
clarkbI think both will work20:41
corvusmaybe after we rework the statsd stuff to containers?20:41
fungiright, both that and the tcpdump option at least confirm the service has started sufficiently to send a stats datagram20:42
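A cheap stand-in for the tcpdump or throwaway statsd service ideas is a UDP socket that waits for a single datagram. The sketch below is a minimal version under stated assumptions: the helper name is hypothetical, port 8125 is the one mentioned above, and nothing here is taken from the actual haproxy-statsd change.

```python
import socket


def wait_for_datagram(sock, timeout=5.0):
    """Block until one UDP datagram arrives on sock, or return None on timeout."""
    sock.settimeout(timeout)
    try:
        data, _addr = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return data


# Example use (commented out so importing this sketch has no side effects):
# listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# listener.bind(("127.0.0.1", 8125))
# print(wait_for_datagram(listener, timeout=30.0))
```

Receiving any datagram only shows that the sender got far enough to emit stats; it says nothing about whether the values are correct.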
fungianyway, i asked about logging because i see the script is already set up to use python logging20:42
*** aspiers has joined #openstack-infra20:43
openstackgerritMerged openstack/project-config master: Remove python-jobs-trusty
openstackgerritMerged openstack/project-config master: Remove in-tree job from networking-bigswitch
*** smarcet has joined #openstack-infra20:46
mordredclarkb: there is part of me that wants to do a more extreme rework of some of the things on status - and another part of me that does not want to do that and is trying to convince the first part to shut up21:00
clarkbmordred: fwiw the way I've approached it is trusty is rip already. Xenial gives us ~2 years to scratch those itches21:01
clarkbso doing the minimum to get to xenial is a win21:01
mordredbecause the rabbit hole could go deep21:01
mordredclarkb: what do we use the jenkins user for on status.o.o ?21:03
openstackgerritJames E. Blair proposed opendev/system-config master: Add haproxy-statsd to haproxy server
clarkbmordred: uh21:03
mordredclarkb: I can't find anything we use it for21:04
corvusthat's ^ updated with a chmod a+x and a test that it's running21:04
corvusi used docker inspect and parsed the json in testinfra21:04
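A check along these lines might look like the following. It is a sketch rather than the test in the change itself: the function name is hypothetical, and it assumes the usual `docker inspect` output shape of a JSON array whose entries carry a `State` object with `Running` and `Restarting` booleans.

```python
import json


def container_is_running(inspect_output):
    """Return True if `docker inspect` JSON reports a running, non-restarting container."""
    containers = json.loads(inspect_output)
    if not containers:
        return False
    state = containers[0].get("State", {})
    # Reject containers flagged as restarting, so a crash loop does not
    # pass as a success just because the check happened to catch it alive.
    return bool(state.get("Running")) and not state.get("Restarting", False)
```

In a testinfra test, the string passed in would come from running `docker inspect` on the host; how that command is invoked and which container name is used are left out here because the log does not show them.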
clarkbmordred: that may be from when we had the recheck tracker?21:04
clarkbcorvus: do you need to tell testinfra to run docker privileged?21:05
clarkbotherwise that looks great21:06
fungicorvus: i'm probably blind, where was the chmod added?21:06
clarkbfungi: its on the file in git21:06
fungiyes, blind21:06
clarkbgerrit web ui shows it as new mode at top of "diff" that isn't a diff21:06
fungigertty doesn't actually show the mode change, but does show an empty diff of that file which should have tipped me off21:07
*** sreejithp_ has quit IRC21:07
corvusclarkb: we have a "docker exec" call in test_gitea so i think that'll work21:10
*** slaweq has joined #openstack-infra21:11
*** sreejithp has joined #openstack-infra21:11
clarkbI've approved it21:12
*** mriedem has quit IRC21:13
fungiwhen do we want gerrit restarted? probably soonish i guess so we're still around in case it fails to restart21:13
fungior would we rather wait for git gc to fire on them all over the weekend and then restart early in the week?21:14
fungii'm up for a restart any time now, but understand if there's a concern that will muddy our investigation into the suspected load balancing behaviors21:15
openstackgerritMonty Taylor proposed opendev/system-config master: Remove jenkins user from status.o.o
corvusi'm okay putting off the restart a bit in favor of lb investigations21:17
fungifine by me, it was really just for the gitweb->gitea link swap, which isn't urgent21:18
corvusalso, as we continue our work to make more system config playbooks, we're really going to need the zuul feature that runs a job when its own configuration changes, otherwise, we'll get to the point where .zuul.yaml changes use 10% of our total quota :)21:19
corvusi count ~43 nodes for this change21:22
corvusso, currently only 5% of our quota.21:23
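As a back-of-the-envelope check, the two rounded figures above imply a total quota and a node count for the 10% scenario. This is only arithmetic on the approximate numbers quoted in the chat, not a reported quota value, and the variable names are illustrative.

```python
# Rounded figures quoted in the chat.
nodes_for_change = 43   # "~43 nodes for this change"
percent_of_quota = 5    # "only 5% of our quota"

# Total quota implied by those two numbers.
implied_quota = nodes_for_change * 100 / percent_of_quota

# Node count at which a single change would consume 10% of that quota.
nodes_at_ten_percent = implied_quota * 10 / 100
```

Since both inputs are approximate, the results should be read as rough orders of magnitude.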
*** slaweq has quit IRC21:24
*** rossella_s has quit IRC21:24
*** rh-jelabarre has quit IRC21:27
corvusclarkb: i finished green mars on my recent trip -- still enjoying that, and i got the first 3 books of the expanse series (they were in a discounted box set...)21:39
*** e0ne has quit IRC21:40
clarkbcorvus: I just picked up red moon and am about 100 pages into it21:40
clarkbI expect you'd probably enjoy it but I am nowhere near the end21:40
fungii'm all about setbacks, evolutionary or otherwise21:40
mordredI'm in the middle of season 2 of the expanse now and enjoying it21:43
corvusmordred: clarkb enjoyed the first few books and persia enjoyed all of them, so i figured i'd at least grab a handful and see where i ended up :)21:44
corvusall jobs green -- just need to do it one more time and it'll merge :)21:56
*** tdasilva has quit IRC22:06
*** jamesmcarthur has joined #openstack-infra22:12
*** hwoarang has joined #openstack-infra22:14
clarkbit failed the run gitea change22:22
clarkbdoes that imply it was trying to pull haproxy-statsd from dockerhub?22:23
clarkbpossibly because docker-compose pull runs as a different user than the one we set up with the buildset registry?22:24
*** slaweq has quit IRC22:25
*** rfarr_ has quit IRC22:25
clarkbhrm I would expect that type of failure to affect the check jobs too22:30
clarkb starting from there we pushed a tag to buildset registry, then dockerhub, then intermediate registry22:32
clarkbso the image should've been findable even if it looked at another location22:32
clarkbis it possible that is a race between upload to dockerhub and it being pullable?22:35
*** _erlon_ has quit IRC22:36
clarkboh :latest is not found22:36
corvusit shouldn't be pulling from dockerhub there, it should still be using the buildset registry22:36
clarkb and that is what we pull22:36
clarkbah ok well at least I can reproduce the failure locally because :latest doesn't exist on dockerhub22:37
clarkbso maybe it isn't pulling from buildset registry for some reason?22:37
clarkb the build side did tag a :latest22:38
corvususe-buildset-registry was skipped22:39
*** Weifan has quit IRC22:39
*** Weifan has joined #openstack-infra22:40
corvusi see it22:40
openstackgerritJames E. Blair proposed opendev/system-config master: Add haproxy-statsd to haproxy server
corvuslegit error :)22:40
clarkbah the check vs gate jobs22:41
fungioh, so it fired too early?22:41
corvusyep, and that caused the gitea job not to be a dependency of the image build job22:41
corvusclarkb: i think it should end up in the inventory?  but not in this case since there was no dependency23:01
corvusrather, not in the error case23:01
corvusclarkb: yeah, here it is from the previous correct check run:
corvussee "artifacts:"23:02
*** hwoarang has quit IRC23:08
*** rcernin has quit IRC23:11
*** hwoarang has joined #openstack-infra23:14
*** slaweq has quit IRC23:24
*** jamesmcarthur has joined #openstack-infra23:33
*** jamesmcarthur has quit IRC23:37

