Friday, 2024-01-19

<clarkb> the emails about needing 2fa for the old github account for review-dev make me wonder if we should just kill that account?  00:16
<fungi> or just keep ignoring it, and add 2fa if we ever need to log into it  00:31
<frickler> this zuul config error seems to say that we don't actually check all branches anymore. but I would also really like to see us have a way to get rid of these errors before adding more repos from github: "Zuul encountered an error while accessing the repo sqlalchemy/sqlalchemy. The error was: Will not fetch project branches as read-only is set"  05:41
<frickler> corvus: ^^ do you have some context for that?  05:41
*** tosky_ is now known as tosky  12:50
<fungi> frickler: i don't have an answer, but the earliest version of that error was introduced more than two years ago with the initial patchset of https://review.opendev.org/816807 and code comments around the current version of the two places it can be raised (as either LookupError or RuntimeError) indicate we should expect that when the scheduler either hasn't attempted to fetch branches from a project yet or has tried and got an error from the remote when it did  14:07
<fungi> so my guess is that something changed permissions in the sqlalchemy/sqlalchemy repo or there was a transient github api error the last time a scheduler tried to get its branches  14:08
<frickler> fungi: well the error seems to show up for all repos we use from github, I just used that specific one as an example. the errors have also been persistently listed for at least some months, so I don't think it is any transient behaviour  14:38
<frickler> as mentioned in the review adding eventlet, it might be interesting to see if the error also appears if we actually add the zuul app to it on github  14:41
<frickler> one could also check whether the error actually appears for all github projects or just some subset  14:41
<corvus> the error means that the scheduler did not put all of the expected information about the project into the zk cache, and the web server, which is responsible for producing those errors, refuses to do that work because it's not its job. it's certainly not working as designed, but exactly why will need some investigation. the scheduler may still be querying all the data, and whatever the web server thinks is missing might also cause the scheduler to query too often.  14:59
<clarkb> frickler: re https://review.opendev.org/c/openstack/project-config/+/906071 maybe let's land that and see if we get a different result from zuul when adding a new project to a running zuul vs trying to load all the projects when restarting zuul?  16:17
<clarkb> also I'm hesitant to require the github app because we know the permissions are overly aggressive  16:18
<frickler> clarkb: sure, I'm not against testing things. I also tried looking at the code a bit but it seems I still lack some basic understanding of how this all works, like how does a runtime error end up in the config error list? and how is that list persisted, and under what condition would such an error ever get removed from it again?  16:22
<clarkb> it's a config error because it hasn't been able to load the configs from those branches  16:24
<clarkb> which I think is actually ok here, because we don't actually load configs from those branches, but zuul will still check first I guess  16:24
<clarkb> I think to get the error to go away we have to force zuul to attempt to refetch the project info from github. I don't know what triggers that; it is possible a restart is required  16:25
<clarkb> if these projects were expected to provide their own job configs then this would be far more problematic, but since they aren't I don't think anyone notices  16:26
<frickler> well we have "include: []" for all github projects, so there sure shouldn't be any attempt to load any config from those repos?  16:40
<fungi> i have a vague recollection there was a reason to not try to fetch configs if the tenant config says not to load any from the project. could it be that the scheduler was smart enough not to try, but then zuul-web is confused by those branches not being in the cache?  16:41
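For context, the "include: []" setting frickler mentions lives in the Zuul tenant configuration. A minimal sketch of such a stanza follows; the tenant name and project are taken from this discussion, and the surrounding layout is illustrative rather than an exact copy of openstack/project-config:

```yaml
# Illustrative Zuul tenant config stanza: the project is listed under the
# github source with include: [], so no job configuration is loaded from it.
- tenant:
    name: openstack
    source:
      github:
        untrusted-projects:
          - sqlalchemy/sqlalchemy:
              include: []
```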
<clarkb> ya I suppose that could be possible but I would need to go and reread the code  16:59
<frickler> ok, I looked at debug.log on zuul02 and there was a tenant reconfiguration event on openstack this morning and github returned just 401 for all queries like this: 2024-01-19 06:20:54,634 DEBUG zuul.GithubRequest: GET https://api.github.com/repos/sqlalchemy/alembic/branches?per_page=100 result: 401, size: 80, duration: 65  17:00
<frickler> also 2024-01-19 06:20:57,358 INFO zuul.GithubConnection.GithubClientManager: No installation ID available for project sqlalchemy/sqlalchemy  17:05
<fungi> https://docs.github.com/en/rest/branches/branches indicates that method should be available anonymously, so not sure why it would result in a 401 (unauthorized) response unless that's github rate-limiting kicking in  17:06
<frickler> yes, I was just testing that it works if I try it manually  17:07
<fungi> i'm able to request the same url and get reasonable data back rather than a 401  17:07
<fungi> right  17:07
<frickler> hmm, I tried to trigger the rate limit, which is 60 reqs/h, and if I do that, the response is a 403, not 401  17:10
<frickler> so it looks like zuul may actually be using some kind of auth that github considers invalid?  17:11
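The comparison being described here can be reproduced with a few lines of Python; this is a minimal sketch assuming the requests library is available, with the token left as a placeholder rather than a real credential:

```python
# Compare an anonymous request against one using the token from zuul.conf:
# a bad token should give 401 (bad credentials), while rate limiting gives 403.
import requests

URL = "https://api.github.com/repos/sqlalchemy/sqlalchemy/branches?per_page=100"
TOKEN = "<api_token from zuul.conf>"  # placeholder, not a real value

anon = requests.get(URL)
authed = requests.get(URL, headers={"Authorization": f"token {TOKEN}"})

print("anonymous:", anon.status_code)                 # 200 per the discussion
print("with configured token:", authed.status_code)   # 401 was observed
```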
<corvus> likely the api token since https://review.opendev.org/794688  17:12
<fungi> so maybe our api token expired or was revoked?  17:25
<clarkb> I think you can list api tokens in github if you login somewhere  17:25
<clarkb> that might tell us  17:25
<clarkb> I'm able to keep an eye on the gitea 1.21.4 upgrade today if we think it is safe to do so: https://review.opendev.org/c/opendev/system-config/+/906062  17:39
<clarkb> the review.o.o cert check did pass last night, so ya I guess it was just a race between systems  17:39
<frickler> o.k., I confirmed that when using the api_token that is in the zuul.conf on zuul02, github returns a 401. I also double-checked my command with a personal token, which gives a 200. so the question is where did that token come from? I don't see any token in either the opendevadmin or openstackadmin account  19:42
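A quick way to test whether a token is accepted at all, without consuming the core API quota, is the /rate_limit endpoint, which does not count against the REST rate limit and returns 401 for bad credentials. A small sketch, again with a placeholder token value:

```python
# Check whether a GitHub token is accepted: /rate_limit does not count against
# the REST API quota and returns 401 "Bad credentials" for an invalid token.
import requests

def token_is_valid(token: str) -> bool:
    resp = requests.get(
        "https://api.github.com/rate_limit",
        headers={"Authorization": f"token {token}"},
    )
    return resp.status_code == 200

print(token_is_valid("<api_token from zuul.conf>"))  # placeholder value
```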
<clarkb> I'm not sure. I suspect corvus probably set it up and may recall. However, isn't the username part of the connection details for that token?  19:44
<clarkb> but ya maybe github removed it for some reason and we need to make a new one  19:44
<fungi> git history should at least narrow it down to a particular timeframe  19:44
<fungi> as far as when it was added, in which case there may be contemporary discussion in logs regarding how  19:45
<frickler> hmm, my firefox history found https://review.opendev.org/c/zuul/zuul/+/794688, but that is very recent. zuul.conf is pretty much exactly one year older  19:49
<frickler> had to go very far back and found this: https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2021-06-03.log.html#t2021-06-03T23:07:59  19:54
<frickler> I can try to generate a fresh one on monday or maybe tomorrow, off for now  19:56
<fungi> sorry, i meant git blame in the group_vars on bridge, which shows api_token was added to the zuul-scheduler group by a commit made on 2021-06-03  19:57
<fungi> but yes, that coincides with the irc log you found  19:57
<opendevreview> Merged opendev/system-config master: Update gitea to 1.21.4  https://review.opendev.org/c/opendev/system-config/+/906062  20:01
<clarkb> that should start deploying in a few minutes. The zuul and eavesdrop jobs should go first  20:13
<clarkb> all of the giteas appear upgraded  20:28
<clarkb> and the infra-prod-service-gitea job was successful  20:28
<clarkb> I have successfully cloned system-config too. This is looking good to me  20:29
