Monday, 2023-03-27

02:41 *** benj_71 is now known as benj_7
04:42 <opendevreview> Ian Wienand proposed zuul/zuul-jobs master: promote-image-container: do not delete tags  https://review.opendev.org/c/zuul/zuul-jobs/+/878612
05:35 <opendevreview> Ian Wienand proposed zuul/zuul-jobs master: [dnm] rough draft of deleting old quay tags  https://review.opendev.org/c/zuul/zuul-jobs/+/878614
05:56 <eandersson> Is github not mirroring properly right now?
05:57 <eandersson> Last update is 3 days old from what I can see
05:57 <eandersson> oh, probably related to the keys that github revoked
06:56 <frickler> eandersson: infra-root: ack, https://zuul.opendev.org/t/openstack/build/84a411295fae463fa58f676fd09d3da7 has been failing since friday
07:02 <frickler> https://opendev.org/openstack/project-config/src/branch/master/zuul.d/secrets.yaml#L648
07:02 <frickler> airship and starlingx seem to have that in every single repo ... :-(
07:06 <opendevreview> Dr. Jens Harbott proposed openstack/project-config master: Update github ssh rsa hostkey for uploads  https://review.opendev.org/c/openstack/project-config/+/878616
07:07 <frickler> StorPool OpenStack CI is also a bit noisy
07:42 *** jpena|off is now known as jpena
07:45 <amoralej> hi, there is some issue with the rackspace mirror we use as the source for the centos-stream sync: http://mirror.rackspace.com/centos-stream/SIGs/
07:46 <amoralej> any idea how we can report the issue or get in touch with the rackspace mirror admins?
07:54 <mnasiadka> Seems Gerrit has some issues, getting 503s on requests...
07:56 <frickler> mnasiadka: yes, looking already
07:59 <frickler> amoralej: this seems to be a neverending story. maybe something for redhat as a company to solve if they want us to keep supporting centos
08:13 <frickler> mnasiadka: restarted gerrit, should be better for now
08:40 <opendevreview> Merged openstack/project-config master: Update github ssh rsa hostkey for uploads  https://review.opendev.org/c/openstack/project-config/+/878616
08:43 <frickler> infra-root: ^^ not sure if someone would want to retrigger uploads or if we think having the next commit fix things would be good enough
08:49 <frickler> upload for the above merge itself succeeded: https://zuul.opendev.org/t/openstack/build/28afc4daa44f447ebcc861e6b94beea4
09:08 <opendevreview> Thierry Carrez proposed opendev/irc-meetings master: Move Large Scale SIG EU+APAC meeting hour  https://review.opendev.org/c/opendev/irc-meetings/+/878634
09:31 <opendevreview> Merged opendev/irc-meetings master: Move Large Scale SIG EU+APAC meeting hour  https://review.opendev.org/c/opendev/irc-meetings/+/878634
12:16 *** amoralej is now known as amoralej|lunch
13:03 <frickler> fyi, github jobs are failing again now, but with a different error. apparently gh is having a major outage
13:04 <frickler> not much green on https://www.githubstatus.com/
13:09 <fungi> ouch
13:20 *** amoralej|lunch is now known as amoralej
13:21 *** ykarel_ is now known as ykarel
13:58 <fungi> github marked it resolved about 30 minutes ago: "an infrastructure change that has been rolled back"
14:08 <corvus> fungi: where did you find the logs for the strange node request failure?
14:08 <fungi> corvus: nl03:/var/log/nodepool/launcher-debug.log.2023-03-23_06
14:10 <fungi> i was going to start looking at graphs for the zk servers when i get a break between ptg sessions
14:10 <corvus> and we're looking at 300-0020813181
14:10 <fungi> correct
14:10 <fungi> nl04 took it first, gave up trying to boot something and declined, then nl03 took it next and that happened
14:14 <corvus> i think i see the bug
14:15 <fungi> oh! that was fast
14:15 <fungi> something new with the openstack statemachine driver rework?
14:17 <corvus> fungi: if you note, the osuosl thread is locking the request only 3 microseconds after the linaro thread unlocked it.  that's certainly close enough for them to race these two lines: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/zk/zookeeper.py#L2174-L2175
14:18 <corvus> linaro unlocks; osuosl locks; linaro sets the lock attribute to None; osuosl fails to unlock because the lock attribute is None
14:18 <fungi> oh, right
14:18 <corvus> this is a fairly ancient bug -- except that it only really became an issue when we started caching node request objects
14:19 <corvus> (when originally written, each thread would have gotten its own request object)
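
The race corvus describes can be illustrated with a short, self-contained sketch. This is not the nodepool code itself; it only models a cached request object shared by two launcher threads, where unlock releases the lock and then resets the shared lock attribute, leaving a window in which another thread's freshly acquired lock gets clobbered:

    # Minimal sketch (not the actual nodepool API): two threads share one
    # cached request object, and unlock_request() resets the shared lock
    # attribute after releasing, which races with the other thread's lock.
    import threading
    import time


    class FakeRequest:
        """Stands in for a cached node request shared between launcher threads."""
        def __init__(self):
            self.lock = None


    def lock_request(request, name):
        # The real code takes a ZooKeeper lock; a plain threading.Lock is
        # enough to show the attribute race.
        lock = threading.Lock()
        lock.acquire()
        request.lock = lock
        print(f"{name}: locked")


    def unlock_request(request, name):
        if request.lock is None:
            print(f"{name}: cannot unlock, lock attribute is already None")
            return
        request.lock.release()
        # Race window: another thread may lock the shared request right here,
        # and the assignment below then clobbers *its* lock.
        time.sleep(0.01)  # widen the window so the demo races reliably
        request.lock = None
        print(f"{name}: unlocked")


    req = FakeRequest()
    lock_request(req, "linaro")

    t = threading.Thread(target=unlock_request, args=(req, "linaro"))
    t.start()
    time.sleep(0.005)              # sneak in while linaro is mid-unlock
    lock_request(req, "osuosl")
    t.join()

    unlock_request(req, "osuosl")  # fails: linaro reset the attribute to None

One plausible fix (which may or may not match the change corvus went on to write) is to clear the attribute only if it still refers to the lock that was just released, or to return to per-thread request objects.
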
14:19 <corvus> i can work on a fix today
14:21 <fungi> thanks, it's certainly not debilitating; this is the first one i noticed, and it eventually got dequeued naturally a few days later when someone uploaded a new patchset. just wanted to try to figure out what happened
14:21 <fungi> likely there have been more and we've never noticed
14:28 <corvus> yeah, stuck node requests are infrequent, but we've certainly seen them.
15:14 <opendevreview> Clark Boylan proposed opendev/system-config master: Update Gerrit 3.7 to 3.7.2  https://review.opendev.org/c/opendev/system-config/+/878695
15:14 <clarkb> infra-root ^ as noted in the commit message this is largely bookkeeping, but upstream gerrit did make a release to fix that related-changes issue, so we may as well reflect that we're pulling it locally
15:17 <clarkb> infra-root also: I shut down gitea services on gitea01-04 on friday. I don't think we've had any trouble with gitea since? I'll look at deleting the servers in that case as soon as the PTG tc meeting finishes later this morning
15:17 <clarkb> please say something if you have any concerns with that plan
15:21 <clarkb> gitea01 in particular was our backed up server. But I ported the db out of it to the new servers and gitea09 is backed up now, so we should have continuity of the database. We can also refer back to the gitea01 backups as they don't disappear
16:26 <opendevreview> Alfredo Moralejo proposed opendev/system-config master: Use Red Hat managed mirror to sync CentOS Stream content  https://review.opendev.org/c/opendev/system-config/+/878701
16:33 *** jpena is now known as jpena|off
16:34 *** amoralej is now known as amoralej|off
18:08 <clarkb> infra-root I'll proceed with the gitea01-04 deletions now
18:08 <clarkb> do we think we need to do anything else with gitea01 prior to its deletion, given its special status?
18:12 <fungi> nothing i can think of
18:15 <clarkb> I guess what I can do is delete the server but not the backing disk volume
18:15 <clarkb> and make note of that volume being the old gitea01 volume, and we can clean it up later
18:15 <clarkb> since the main concern here is the running instance and its associated resources (the volume is one of those resources, but IPs and instance quota are the bigger issue I think)
18:21 <clarkb> ok, I think I can delete gitea01.opendev.org and then `openstack volume set --name gitea01-old-boot-volume` on its boot device to make that clear in the openstack api state
18:22 <clarkb> I'll proceed with that in a few minutes if no one objects (gitea02-04 are gone, as are their volumes)
18:28 <clarkb> ok, that appears to have worked. gitea01-04 are deleted. gitea01's boot disk remains and I set its name so that it's clear what that is/was in a volume listing
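
For reference, the delete-the-server-but-keep-and-rename-the-boot-volume sequence described above could be scripted with openstacksdk roughly as below. This is a sketch, not what was actually run: the cloud name is illustrative, and it assumes the boot volume was not created with delete_on_termination, so it survives the server deletion.

    # Rough openstacksdk sketch (illustrative names, not the exact commands run):
    # delete a server, leave its boot volume behind, and rename the volume so a
    # later `openstack volume list` makes its origin obvious.
    import openstack

    conn = openstack.connect(cloud="mycloud")  # cloud name is an assumption

    server = conn.compute.find_server("gitea01.opendev.org", ignore_missing=False)
    # Record the attached volume IDs before the server goes away.
    volume_ids = [vol["id"] for vol in server.attached_volumes]

    conn.compute.delete_server(server)
    conn.compute.wait_for_delete(server)

    # Assuming delete_on_termination was not set, the boot volume is still
    # present; give it a descriptive name so it can be cleaned up later.
    for volume_id in volume_ids:
        conn.block_storage.update_volume(volume_id, name="gitea01-old-boot-volume")
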
18:28 <clarkb> #status log Gitea01-04 have been deleted. Gitea is entirely running off of the new replacement servers at this point.
18:29 <opendevstatus> clarkb: finished logging
18:29 <opendevreview> Clark Boylan proposed opendev/zone-opendev.org master: Remove gitea01-04 server DNS records  https://review.opendev.org/c/opendev/zone-opendev.org/+/878710
18:29 <clarkb> and that change will clean up dns to match
18:31 <clarkb> now to push and hold a node for gitea 1.19.0. Historically we've often not upgraded to the .0 releases though
18:33 <opendevreview> Clark Boylan proposed opendev/system-config master: DNM intentional gitea failure to hold a node  https://review.opendev.org/c/opendev/system-config/+/848181
18:35 <clarkb> I've put a hold in place for ^
18:56 <clarkb> fungi: frickler: do we need to send a service-announce email indicating projects with github replication may need to update the ssh host key?
18:56 <clarkb> ildikov: ^ fyi, frickler found evidence that airship and starlingx have this issue
18:57 <ildikov> clarkb: thank you for the heads up!
18:57 <clarkb> hrm, though I'm not finding it. I'm also probably not looking in the right way, or maybe it all got fixed
18:58 <clarkb> is it possible airship and starlingx use the openstack account and jobs? so fixing openstack fixed it for them too
18:58 <clarkb> that may be it
18:59 <ildikov> I'm not 100% sure, but I think that's more than possible
19:55 <clarkb> https://104.130.127.14:3081/opendev/system-config is running gitea 1.19.0 and looks good
19:56 <clarkb> the main feature added by this release (actions) is something I have disabled in the change
19:57 <clarkb> I suspect we're mostly going to need to check for regressions of existing behavior
20:02 <clarkb> with the horizon thing sorted out, my tuesday ptg schedule is looking fairly light. I'm up for doing the meeting if others are
20:02 <clarkb> let me know and I'll put an agenda together later today if we decide to have one
20:36 <fungi> i can take it or leave it
20:42 <ianw> clarkb: with 878695 i think we can also revert download-commands to the 3.7.2 tag -- mind if i push an update with that?
20:44 <clarkb> ianw: it's already on the 3.7.2 tag?
20:44 <clarkb> oh wait
20:44 <clarkb> I'm looking at my old checkout, not the one for that change. when did we flip it to master?
20:45 <clarkb> but ya, 3.7.2 and master are the same, so that's a good cleanup
20:45 <clarkb> ianw: feel free
20:46 <ianw> yeah, i had to flip it to rebuild for https://review.opendev.org/c/opendev/system-config/+/878042
20:48 <opendevreview> Ian Wienand proposed opendev/system-config master: Update Gerrit 3.7 to 3.7.2  https://review.opendev.org/c/opendev/system-config/+/878695
20:50 <ianw> there have been no changes since https://23.253.56.187 deployed, so i think that's still a valid tester
22:41 <clarkb> ianw: that's true for gerrit itself as well (no changes since that deployment)?
22:54 <clarkb> ianw: any opinion on having a meeting tomorrow?
22:54 <clarkb> I suspect fungi and frickler are most involved in the ptg and would be happy to skip (fungi indicated this earlier, actually)
22:54 <clarkb> I'm inclined to skip if you don't think there is an urgent reason to have it
23:07 <opendevreview> Merged opendev/system-config master: Update Gerrit 3.7 to 3.7.2  https://review.opendev.org/c/opendev/system-config/+/878695
23:17 <ianw> ok, happy to skip
23:17 <ianw> i think we have enough going on to keep us busy :)
23:40 * Clark[m] jumped onto the laptop to redo python3.11 things to match the desktop and hasn't loaded keys. I'll go ahead and send a note about skipping this week
23:41 <Clark[m]> I should probably make a little ansible playbook to add a new python version, install the tools, and update the symlinks
23:42 <fungi> i have a bash script, which i guess is approximately the same thing
23:47 <Clark[m]> ok, I sent an email to the list to make it official
23:49 <fungi> thanks!
23:51 *** dhill is now known as Guest9071
23:51 <Clark[m]> tomorrow I should run the zuul test suite and see if it goes zoom zoom under py311 compared to py310, since my local 311 installation has the extra cpu flags enabled
