Sunday, 2020-03-29

*** tosky has quit IRC00:00
openstackgerritMohammed Naser proposed zuul/zuul-jobs master: chart-testing-lint-single: new job  https://review.opendev.org/71566500:03
*** factor has joined #opendev00:43
mnaserok, so: i feel awful about the extra (small) workload as we're doing more opendev-y things but i'm wondering what's the "delete a project" story like right now00:53
mnaserbackground: i just realized that everything about helm and its charts is really meant to be a "repository" of charts, and potentially having "one chart" per repo is something that is a little weird in the entire ecosystem00:53
mnaserso i'm just wondering if it makes sense to just create a vexxhost/helm-charts and manage everything there in a monorepo, which absolutely sucks, but maybe it might be "the way" unfortunately00:54
fungithe delete story right now is to delete the content and set the acl to read-only00:56
fungiso that it doesn't accept change proposals00:56
mnaserfungi: ah, i see, so still leaves some "artifacts" at the moment.00:58
mnaseri am trying to look at ways of making a chart-per-repo happen right now still.  it's so silly.00:58
mnaserjust a heads-up, the acl issue is there and i don't have access, so we may not want to land any new project creations -- https://review.opendev.org/#/admin/projects/vexxhost/smokeping_prober-helm,access01:18
fungiyeah, i'm going to see what removing it from the jeepyb cache does01:36
fungiokay, i've backed up and then manually removed the entry for that project from /opt/lib/jeepyb/project.cache01:55
fungiit looks like it thought it had already created the project earlier and so wasn't trying to update the acl once we fixed things01:56
fungiwhich didn't seem to help, i'll try to take a closer look at the logs tomorrow03:57
*** diablo_rojo has quit IRC06:40
*** DSpider has joined #opendev07:43
*** tosky has joined #opendev10:52
*** DSpider has quit IRC13:52
fungitesting a theory. reviewing the acl update code in manage_projects.py it looks like a nonexistent acl sha in the project cache and a nonexistent acl sha in the acl cache seem to result in them "matching" in this case and the update getting skipped, so i've manually inserted an incorrect sha for this repo to see what it does14:02
fungithere may be a problem with the acl cache14:03
fungisomehow this was matching even though there was no acl sha recorded in the project cache for that repo: https://opendev.org/opendev/jeepyb/src/branch/master/jeepyb/cmd/manage_projects.py#L55214:08
fungisince there was no acl sha in the cache, project_cache[project].get('acl-sha') returns none14:08
fungiwhich means acl_cache.get(acl_config) must also be returning none14:08
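An illustrative sketch of the comparison discussed above, not the actual manage_projects.py code. Only the two .get() expressions come from the log; the helper names, the ACL file path, and the dictionary contents are hypothetical. It shows how two missing entries compare equal, so the ACL looks up to date and the update is skipped.

```python
def acl_unchanged(project_cache, acl_cache, project, acl_config):
    """Naive check: a missing sha on both sides (None == None) counts as a match."""
    cached_sha = project_cache[project].get('acl-sha')
    current_sha = acl_cache.get(acl_config)
    return cached_sha == current_sha


def acl_unchanged_strict(project_cache, acl_cache, project, acl_config):
    """Hypothetical stricter check: a missing sha on either side never matches."""
    cached_sha = project_cache[project].get('acl-sha')
    current_sha = acl_cache.get(acl_config)
    if cached_sha is None or current_sha is None:
        return False
    return cached_sha == current_sha


# Hypothetical state: no acl-sha recorded for the project, and no entry
# for its ACL file in the acl cache either.
project = 'vexxhost/smokeping_prober-helm'
acl_config = 'acls/example.config'  # placeholder path, not from the log
project_cache = {project: {}}
acl_cache = {}

naive_result = acl_unchanged(project_cache, acl_cache, project, acl_config)
strict_result = acl_unchanged_strict(project_cache, acl_cache, project, acl_config)
```

With both caches empty the naive check reports a match, while the stricter variant forces an ACL update.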
openstackgerritMonty Taylor proposed opendev/system-config master: Really bindmount acls  https://review.opendev.org/71569714:10
mordredfungi: ^^14:10
mordredyeah - our fix for that wasn't quite complete14:11
fungiaha!14:11
mordredso - yes - you were definitely on the right track14:11
fungii'll undo my local edit to the project cache so we can run that through its paces14:11
mordredcool14:11
mordredalso - fwiw - there are errors in the ansible log related to first_found and our iptables rules14:12
fungiwe already seem to set errors='ignore' on that task14:19
mordred*awesome*14:19
fungiis it ignoring our ignore?14:19
mordredmaybe?14:20
mordredmaybe it's still warning us regardless14:20
fungiwith a confusing error saying to add a parameter which we already add14:20
mordredyeah14:26
*** DSpider has joined #opendev14:27
openstackgerritMerged opendev/system-config master: Really bindmount acls  https://review.opendev.org/71569715:38
openstackgerritMohammed Naser proposed zuul/zuul-jobs master: helm: collect kubernetes logs in post  https://review.opendev.org/71570915:40
AJaegerhttps://review.opendev.org/#/admin/projects/vexxhost/smokeping_prober-helm,access is not updated, did ansible run with the fix?16:40
fungii've been watching, last manage-projects run occurred at 16:04 utc, which may have been too soon to incorporate the new docker config16:40
AJaegerchange merged at 15:38 utc AFAIU, so shouldn't it be there? Thanks for watching - and no urgency from my part ;)16:42
fungii'm not clear on whether it needs to be merged before the start of the ansible run, and when ansible started. i'll check the cron log16:43
fungi--- begin run @ 2020-03-29T15:30:01+00:00 ---16:44
fungi--- begin run @ 2020-03-29T16:30:01+00:00 ---16:44
fungiso it merged after the ansible run which was responsible for the 16:04 manage-projects16:45
fungiexpecting around 15 minutes before we'll know if this has solved things16:48
openstackgerritMonty Taylor proposed opendev/system-config master: Add constraints support to python-builder  https://review.opendev.org/71397216:57
openstackgerritMonty Taylor proposed opendev/system-config master: Fix siblings support in python-builder  https://review.opendev.org/71571716:57
mordredmnaser: ^^ I think those two will be needed for the general container building case to work properly - would love double-checking my logic there16:58
fungiit's underway now17:03
fungishould know in another minute or two17:04
mordredwoot17:07
fungiit seems like it's taking a good deal longer than before17:07
mordredmaybe that's a good sign17:07
*** elod has quit IRC17:08
fungimaybe. does appear to be cloning every repo though17:09
mordredfungi: that seems a bit extreme - where is it cloning them to?17:12
mordred(wondering if we missed a bindmount)17:12
mordredfungi: looks like it's cloning to /opt/lib/jeepyb like it's supposed to17:14
fungibut should it do that every time it runs?17:14
mordredhrm17:14
mordredit doesn't seem like /opt/lib/jeepyb has copies of all of these repos17:14
fungii thought the point of the git cache was that it would reuse those on subsequent runs and just fetch updates for refs/meta/config17:15
mordredyeah17:15
mnasermordred: ok neat, i'm hoping to have time to get back on hacking using those images soon (hopefully).17:15
mordredbut check out /opt/lib/jeepyb/openstack17:15
fungibut yes, they seem to get cleared out between each run (or at least they were all empty when i looked earlier)17:15
mordredyeah17:15
mordredmnaser: same17:15
fungiit's working on cloning nova now17:15
fungibeen cloning nova for nearly two minutes, which is unsurprising17:16
mordredyeah. I'm a little surprised at the behavior tbh17:17
mordredI wish we had a log of _why_ we think we need to clone it :)17:17
mordredok - first step of process_acls is to do this17:18
mordredbut this now makes me wonder if the acl_cache erroneously has None entries for the acl_sha for a bunch of things17:18
mordredfungi: acl-sha seems to be mostly null in the project cache17:19
mordredso I think we might, for this run, be re-pulling all of the acls so that we can verify an acl match and then re-write a new sha to the cache file17:20
fungiquite possible if it cleared them all out because it was previously run with an empty acl config17:20
mordredyeah17:20
mordredso this run might take a minute17:21
mordred:)17:21
fungihttps://review.opendev.org/#/admin/projects/vexxhost/smokeping_prober-helm,access17:35
fungiyay!17:35
fungilooking good17:35
fungishould i go ahead and approve 713809 now or is there anything else we need to check first?17:36
fungi(re)approve i mean17:37
mordredfungi: I think go for it17:38
fungibam17:39
openstackgerritMerged openstack/project-config master: Added new project openstack-tempest-skiplist  https://review.opendev.org/71380917:46
mordredfungi: it should be noted that github mirror creation is also not running17:50
fungimaybe this is the time to start considering in earnest migrating the openstack namespace to git ref replication jobs17:51
fungior at least take a step toward it. in that scenario, the github caretakers for openstack would need to manually create the repo in gh anyway17:52
fungibut being able to turn off github integration in jeepyb and gerrit is becoming increasingly attractive17:52
*** elod has joined #opendev17:53
fungianalyzing the manage-projects log, typical runs take 2 seconds if a no-op and 5 seconds if there's a new project created. that last run took just over 33 minutes18:06
fungihopefully the next one will be back in the 2-5 second range18:06
mordredyeah18:07
mordred(to all of the above)18:09
mnaseryay, i can merge my own code again18:20
mnaser:p18:20
mnaserfungi, mordred: thanks for the work on this18:20
fungimnaser: thanks for the patience!18:34
fungi18:34 completed in just under 11 seconds18:38
fungii guess because it also pulled a remote repo for import18:38
fungihttps://opendev.org/openstack/openstack-tempest-skiplist exists but https://review.opendev.org/#/admin/projects/openstack/openstack-tempest-skiplist,access does not18:39
fungiException: Gerrit error executing gerrit ls-groups -v -q "openstack-tempest-skiplist-core"18:41
fungilooks like new group creation may not be working yet18:41
fungifrom memory, the order of operations is that manage-projects creates the group, then tries to poll multiple times for the group because gerrit returns from the api call before the group exists18:56
fungiand manage-projects needs to obtain the group uuid to add to the config18:57
fungi(in refs/meta/config of the repo)18:57
clarkbit may not poll anymore?18:57
clarkbthat was something we changed to make this testable, removed direct db access18:57
fungiwell, that's the polling which raised the exception18:57
clarkbah ok18:57
fungioh, wait18:58
fungiyou may be right18:58
fungiit was checking for group creation via db access, and this is the uuid lookup which raised an exception18:58
fungiso maybe we just need to retry that?18:58
clarkbit shouldn't do db access anymore18:58
fungiyeah, which may be why we're now losing this race18:59
clarkbor maybe that's what you meant? and ya retrying there seems reasonable18:59
fungithough https://review.opendev.org/#/admin/groups/?filter=openstack-tempest-skiplist-core still returns nothing19:00
fungiso maybe the group creation command failed, or never happened?19:00
fungiwe don't seem to log it19:00
fungiahh, this is within the retry19:10
fungifor x in range(retries):19:10
fungigroup_list = list(gerrit.listGroup(group, verbose=True))19:11
fungibut gerritlib is raising an exception in there19:11
fungiso the loop bails on the first iteration19:11
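A minimal sketch of the failure mode described above and one way to keep the retry loop alive, assuming the lookup raises a plain Exception. The listGroup(group, verbose=True) call is taken from the log; find_group, FlakyGerrit, and the placeholder uuid are hypothetical stand-ins, not jeepyb or gerritlib code.

```python
import time


def find_group(gerrit, group, retries=10, delay=0):
    """Poll for a group, treating an exception like an empty result."""
    for _ in range(retries):
        try:
            group_list = list(gerrit.listGroup(group, verbose=True))
        except Exception:
            # Without this handler the first failed lookup would escape
            # the loop and no further attempts would be made.
            group_list = []
        if group_list:
            return group_list
        time.sleep(delay)
    return None


class FlakyGerrit:
    """Hypothetical stand-in that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def listGroup(self, group, verbose=False):
        self.calls += 1
        if self.calls <= self.failures:
            raise Exception('Gerrit error executing gerrit ls-groups')
        return ['%s\tplaceholder-uuid' % group]


gerrit = FlakyGerrit(failures=2)
result = find_group(gerrit, 'openstack-tempest-skiplist-core', retries=5)
```

Here the third attempt succeeds; if every attempt fails, find_group returns None so the caller can go on to create the group.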
mordredis this one where wrapping the call in list() may have broken it?19:20
mordredoh - I guess not if we get that exception message19:21
openstackgerritMonty Taylor proposed opendev/jeepyb master: Trap for exception in listGroup  https://review.opendev.org/71572319:23
mordredfungi, clarkb : what about something like that ^^?19:23
fungiyeah, sorry, got sidetracked by tasty thai takeout leftovers19:23
mordredmmm. tasty thai19:23
fungimordred: but the group was never created19:23
fungiso i don't think we've (yet) ruled out a problem in the group creation command itself19:24
mordredah - nod19:24
fungialso i'm frosting the chocolate orange cake i baked earlier, so will still be a few minites19:25
fungiminutes19:25
mordrednod19:25
fungiso this is happening in process_acls() and we call create_groups_file() where the exception is getting raised before we push_acl_config() which is, i think, what should trigger gerrit to create the groups, right?19:37
fungidid something change around that recently?19:38
clarkbno, you have to create the group before pushing the file because the file has to have the uuid in it19:38
fungiahh, yeah, so there must be an api call somewhere doing that step19:42
fungioh, it's in get_group_uuid() not _get_group_uuid()19:43
fungii misread the backtrace and missed the wrapper19:43
fungiso looking in get_group_uuid() the gerritlib exception is happening before we ever get to the gerrit.createGroup() call19:45
fungimaybe the exception is new? the flow there seems to imply that we expect _get_group_uuid() to return a falsey value if the group doesn't exist, rather than raise an exception19:46
clarkbya maybe a python3 related change?19:50
fungishould we fix this in gerritlib or work around it by catching the exception in jeepyb?20:10
fungii guess if we do it in gerritlib we need another release, right?20:10
clarkbya20:11
clarkbwith the last thing we fixed it in both places and can remove the jeepyb fix once a release for gerritlib happens20:11
openstackgerritJeremy Stanley proposed opendev/jeepyb master: Catch exceptions when checking for groups  https://review.opendev.org/71572620:45
fungimordred: clarkb: what do you think of trying that ^ mitigation, and then we correct in gerritlib and revert if it works20:46
clarkbfungi: ya that seems fine. I'm about to hop on the bike so in a bad spot to record it in gerrit20:47
openstackgerritMohammed Naser proposed zuul/zuul-jobs master: Revert "Revert "Extract pep8 messages for inline comments""  https://review.opendev.org/71572721:09
*** hashar has joined #opendev21:32
*** hashar has quit IRC21:39
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Add Fedora 31 support and test jobs  https://review.opendev.org/70841621:47
openstackgerritMohammed Naser proposed openstack/project-config master: DNM: making sure pep8 inline comments don't break  https://review.opendev.org/71573321:52
openstackgerritPiotr Kopec proposed openstack/project-config master: Add new project and repository for tripleo-compute-extras  https://review.opendev.org/71573421:53
openstackgerritIan Wienand proposed opendev/base-jobs master: Revert "virtualenv-config: add to base pre playbook"  https://review.opendev.org/71573521:56
*** DSpider has quit IRC21:58
openstackgerritMohammed Naser proposed zuul/zuul-jobs master: Revert "Revert "Extract pep8 messages for inline comments""  https://review.opendev.org/71572722:03
openstackgerritPiotr Kopec proposed openstack/project-config master: Add new project and repository for tripleo-compute-extras  https://review.opendev.org/71573422:12
*** smcginnis has quit IRC22:19
*** smcginnis has joined #opendev22:19
fungii've proxy +2'd 715726 on clarkb's behalf and approved it22:24
fungiwill keep an eye on manage-projects to see if that gets us the rest of the way22:24
openstackgerritPiotr Kopec proposed openstack/project-config master: Add new project and repository for tripleo-compute-extras  https://review.opendev.org/71573422:27
*** tosky has quit IRC22:54
openstackgerritMohammed Naser proposed zuul/zuul-jobs master: Revert "Revert "Extract pep8 messages for inline comments""  https://review.opendev.org/71572723:08
openstackgerritMerged opendev/jeepyb master: Catch exceptions when checking for groups  https://review.opendev.org/71572623:14
openstackgerritMohammed Naser proposed zuul/zuul-jobs master: Revert "Revert "Extract pep8 messages for inline comments""  https://review.opendev.org/71572723:22
mordredmnaser: your dnm testing patch didn't produce any pep8 errors :)23:45
mnasermordred: not sure if my message made it but it was hopefully meant to pass and catch the warning failure23:56
