Wednesday, 2023-08-30

opendevreview Dale Smith proposed openstack/project-config master: Add magnum-capi-helm-charts to Magnum project  https://review.opendev.org/c/openstack/project-config/+/893117 00:47
opendevreview Dale Smith proposed openstack/project-config master: Add magnum-capi-helm-charts to Magnum project  https://review.opendev.org/c/openstack/project-config/+/893117 00:52
opendevreview Dale Smith proposed openstack/project-config master: Add magnum-capi-helm-charts to Magnum project  https://review.opendev.org/c/openstack/project-config/+/893117 01:53
*** TheMaster is now known as Unit193 10:00
opendevreview Dr. Jens Harbott proposed openstack/project-config master: Unpause image uploads for rax-iad part 2  https://review.opendev.org/c/openstack/project-config/+/893145 10:36
frickler clarkb: corvus: ^^ according to the nodepool docs, this is per provider, so should look like this? https://zuul-ci.org/docs/nodepool/latest/configuration.html#attr-providers.max-concurrency 10:36
frickler this would avoid the issue of slowing down uploads for other providers 10:37
opendevreview Merged openstack/project-config master: Add magnum-capi-helm-charts to Magnum project  https://review.opendev.org/c/openstack/project-config/+/893117 11:56
*** d34dh0r5- is now known as d34dh0r53 12:14
corvus frickler: that just affects node requests on launchers; you want `--upload-workers` command line argument for the builder 13:30
opendevreview Lukas Kranz proposed zuul/zuul-jobs master: prepare-workspace-git: Add ability to define synced pojects  https://review.opendev.org/c/zuul/zuul-jobs/+/887917 13:53
opendevreview Lukas Kranz proposed zuul/zuul-jobs master: prepare-workspace-git: Add ability to define synced pojects  https://review.opendev.org/c/zuul/zuul-jobs/+/887917 13:57
opendevreview Maksim Malchuk proposed openstack/diskimage-builder master: Fix and issue with wait_for  https://review.opendev.org/c/openstack/diskimage-builder/+/893196 14:32
fungi just a heads up, i'm in and out doing storm prep and errands, but should be around more during my afternoon 15:05
clarkb fungi: it's expected to hit you tomorrow night? 15:12
fungi rain and wind are likely to pick up around 5pm local time here, but yeah if the eye regains coherence on the way out to the atlantic that will be tomorrow 15:13
fungi right now the eye is projected to pass south of us, but it's hard to track/predict accurately once it's over land 15:13
clarkb hopefully it doesn't have too big of an impact. That aside it looks like it's already creating massive problems in florida 15:14
fungi the main thing we have to keep an eye on is wind-driven surge, which will depend a lot on wind direction (in turn depending on where the eye reappears) and how timing coincides with the tides 15:15
fungi north carolina has recently built out a really great flood inundation mapping and prediction network though, very glad that's a thing now: https://fiman.nc.gov/# 15:16
fungi and in the past few months they added a gauge a few blocks from our house, so even better 15:17
clarkb that's neat. They've even built it up far inland. I guess river flooding is an issue too 15:19
fungi yes, the topology in nc includes a mountain range and coastal plane. the eastern continental divide passes through the west end of the state, and so everything that falls from the sky in the state flows this direction 15:27
fungi er, coastal plain 15:27
fungi there are a variety of flood risks across the state, whether it's flash flooding in valleys, poor drainage in low-lying areas, or wind-driven surges on the shore 15:29
fungi stepping out again for a bit but should be back in an hour or so 15:31
clarkb I'm going to approve https://review.opendev.org/c/opendev/system-config/+/892701. That image was primarily created for gitea on k8s which means it isn't used in production today though it will trigger an infra-prod-service-gitea run which should noop 15:36
opendevreview Merged opendev/system-config master: Update jinja-init image to bookworm  https://review.opendev.org/c/opendev/system-config/+/892701 16:13
*** ralonsoh is now known as ralonsoh_ooo 16:20
clarkb as expected the gitea job ran but was quick and successful and the service is still reachable 16:42
fungi oh good 17:06
fungi (back now btw) 17:06
TheJulia Greetings folks, can I get a hold added for job name "ironic-tempest-ipa-partition-uefi-pxe-grub2" ? Thanks! 17:36
clarkb I was about to say sure. Then ssh failed because I haven't loaded keys yet /me looks for keys 17:40
clarkb TheJulia: it wouldn't let me create a hold without setting a project name so I set it for ironic 17:46
fungi yes, project name is required 17:50
opendevreview Merged openstack/diskimage-builder master: Fix and issue with wait_for  https://review.opendev.org/c/openstack/diskimage-builder/+/893196 17:58
clarkb fungi: what's your read on gitea + bookworm ssh risk. Do you think we should hold a gitea and a gerrit node and have them replicate to each other to ensure that mina can talk to openssh 9.2 using an rsa key? similarly what about the gerrit + bookworm change that bumps us to java 17? 18:00
fungi i think i missed some of the nuances of that (though i did see it being discussed). can you resummarize the issue? 18:02
clarkb fungi: for gitea we're upgrading bullseye to bookworm which bumps us from openssh 8.4 to 9.2. This crosses the 8.8 rsa with sha1 is bad/evil/disabled threshold. Historically Gerrit's MINA library struggled with this. So far we've only run into problems with MINA as a server, but there is potential that it will break as the client too (though they say they fixed both) 18:03
clarkb In our CI testing we do test that we can push to gitea but we use git + openssh not Gerrit + MINA 18:04
TheJulia clarkb: perfect, thanks! 18:04
clarkb If we do have problems we can switch to an ed25519 key 18:04
clarkb The gerrit image upgrade to bookworm also bumps us to java 17. Bookworm has java 11 if we want to separate the distro update and the java update but currently as written it does both together. I don't have a full grasp on the differences between java 11 and java 17. In theory GC will perform better. Gerrit says they fully support java 17 at this point too so it should be fine 18:05
fungi could we test upgrading one gitea server and then take it down in haproxy if replication isn't working to it as expected? 18:08
fungi or are held nodes still an easier path for confirming? 18:09
clarkb fungi: yes I think we can put all but one of the gitea servers in the emergency file list, land the change, check replication, remove the other giteas from emergency and wait for the daily run to upgrade them 18:09
clarkb as an alternative to waiting for the daily run we can land another change to trigger the job or manually run the playbook from bridge 18:09
fungi i'd be fine with that. if all we're concerned with is impact to commit replication, then the problems that will briefly produce should be minimal 18:12
frickler sounds good to me too 18:34
frickler for nodepool, we have this variable here https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/nodepool-builder/defaults/main.yaml 18:34
frickler do we want to change the default or just override the nodepool group_var in our inventory? 18:35
Clark[m] I would override not change the default 18:37
frickler ok so I added "nodepool_builder_upload_workers: 1" in group_vars/nodepool.yaml, did not commit yet, can someone watch whether that works as expected? I'll update the project-config patch next 19:06
fungi i can check back in on it in a bit 19:07
fungi i still haven't caught up from this morning's personal life interruptions, but do still hope to get the ticket opened with rackspace before the end of my day 19:08
opendevreview Dr. Jens Harbott proposed openstack/project-config master: Unpause image uploads for rax-iad part 2  https://review.opendev.org/c/openstack/project-config/+/893145 19:08
frickler once the builders are running with 1 thread, you could merge ^^ then, I'll check back on it tomorrow 19:09
TheJulia oooh ahh, autohold appears ready. Who shall I send my pub key to?  Thanks in advance! 19:45
fungi TheJulia: ooh, gimme 19:46
fungi TheJulia: ssh root@213.32.74.11 19:47
TheJulia muahahahahahahah 19:48
fungi world domination is closer than ever 19:49
TheJulia hey guys, you can reclaim that hold now, I've got what I needed! Thanks! 20:21
frickler TheJulia: done 20:23
frickler corvus: Clark[m]: nodepool-builder is running with 1 thread now on nb01+2, so 893145 should be good to go 20:24
* frickler should really eod now 20:24
corvus where's the change that changes the upload workers? 20:46
fungi corvus: the local vars on the bridge were updated per 19:06z in scrollback 20:48
corvus frickler fungi Clark i think/hope there was a miscommunication above.  it looks like frickler was asking where to make the change for the number of upload workers and interpreted Clark's response as indicating that it should be made in the secret hostvars on bridge.  but this is not a secret, and the change should be made somewhere in the opendev/system-config repository.  it could either be made in the role definition itself (where frickler 20:50
corvus originally pointed) since we want it to apply to all builders and we are the only users; or it could be made in the inventory files in the opendev/system-config repo.  but either way, it shouldn't be on bridge. 20:50
fungi makes sense. i can push a change to add that to system-config and then we can pull out the entry on bridge once it merges 20:51
fungi working on that now 20:51
corvus https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/nodepool-builder/defaults/main.yaml (original location from frickler) 20:52
corvus or something like https://opendev.org/opendev/system-config/src/branch/master/inventory/service/group_vars/nodepool-builder.yaml if we want to narrow the scope 20:52
corvus or there's like 5 other places it could go :) 20:53
opendevreview Merged openstack/project-config master: Unpause image uploads for rax-iad part 2  https://review.opendev.org/c/openstack/project-config/+/893145 20:54
corvus i would probably change the original location -- that seems less confusing to me... 20:57
corvus (because in this particular case, overriding anywhere else would basically mean we set a default value we never use?) 20:57
clarkb You can set it when the role is included 20:58
opendevreview Jeremy Stanley proposed opendev/system-config master: Temporarily limit node image upload concurrency  https://review.opendev.org/c/opendev/system-config/+/893289 20:58
fungi like that? i'll work on opening the ticket now and amend that change with the id in place of the todo comment 20:59
clarkb yes i think that will work, though corvus is suggesting we just change the role default 20:59
clarkb it's just weird to me to use a default like that. To me the default should be for an ideal or at least realistic state and we are overriding that for unexpected behavior 20:59
corvus but the role is "run the opendev nodepool builder" 21:00
corvus so the "default" is really the "way we run the opendev nodepool builder" 21:00
fungi both arguments make sense to me. i don't really have a preference but happy to adjust the change to whatever consensus is reached 21:00
clarkb ya I'm happy to do it the way corvus suggests since that is the strongest opinion any of us have expressed 21:01
fungi okay, i can switch the change to do that when i get the ticket open in a few minutes 21:02
fungi "Image upload processing delay in IAD 21:06
fungi For the past month (since around the end of July), when uploading images to the Glance API for the IAD region, backend processing takes at least 30 minutes after the upload has completed until the uploaded image appears in the image list. Uploading the same image to the DFW or ORD regions only takes a few minutes at this stage, by comparison. 21:06
fungi Worse, if we upload multiple images around the same time, the delay for any of them appearing in the image list appears to scale roughly linearly with the number of images uploaded, and so has been observed to exceed 5 hours in some cases. 21:06
fungi Thanks for looking into it!" 21:07
fungi does that seem to encapsulate the concern without getting too into the weeds? 21:07
corvus either way, it's an extra level of indirection which makes how we run the system a little less discoverable (or at least, prone to accidental misunderstanding) 21:08
clarkb fungi: the only other thing is maybe mention the task system? when uploading images to the Glance API using the task system... 21:08
fungi do we explicitly invoke tasks? 21:08
fungi i should probably also say we're uploading vhd images? 21:08
clarkb openstacksdk/shade/whatever it is called now does 21:08
corvus (heh, to be clear, the strength of my opinion on this is like 2 out of 10 -- but that may well still be the strongest :) 21:08
fungi the "import" task i guess? 21:08
clarkb maybe? maybe it's better to leave that out until they ask what specific apis are being used 21:09
fungi feels odd to say we're using tasks but not say what tasks 21:09
fungi maybe that just feels odd because i'm fuzzy on that part of the api 21:09
clarkb everyone is because it is undocumented :) 21:10
fungi but if it's undocumented how do we know we're using it? 21:10
fungi anyway, the api response does mention the import task 21:10
clarkb because it's the rax image upload system. Glance added tasks just for rax and that is why it is undocumented because it wasn't a thing anyone else ever ended up using 21:10
fungi "the IAD region, backend processing takes at least 30 minutes after the upload has completed until the uploaded image appears in the image list, a while after the import task returned by the image create call is showing an image ID." 21:12
clarkb lgtm 21:12
fungi huh, that didn't copy all of what i thought i highlighted 21:13
fungi "For the past month (since around the end of July), when uploading VHD images to the Glance API for the IAD region, backend processing takes at least 30 minutes after the upload has completed until the uploaded image appears in the image list, a while after the import task returned by the image create call is showing an image ID." 21:13
fungi anyway, that adds mention of vhd and tasks 21:13
opendevreview Jeremy Stanley proposed opendev/system-config master: Temporarily limit node image upload concurrency  https://review.opendev.org/c/opendev/system-config/+/893289 21:16
fungi corvus: clarkb: ^ how's that? 21:16
clarkb +2 21:17
corvus +3 21:17
fungi once it merges i'll undo the similar addition on bridge 21:18
opendevreview Clark Boylan proposed openstack/project-config master: Switch OpenStack's Zuul tenant to Ansible 8 by default  https://review.opendev.org/c/openstack/project-config/+/893290 21:21
clarkb frickler: ^ there's the change to merge on Monday. 21:21
JayF I suddenly, for a completely unrelated reason, feel compelled to go run a test job on Ironic/bifrost ;) 21:21
opendevreview Merged opendev/system-config master: Temporarily limit node image upload concurrency  https://review.opendev.org/c/opendev/system-config/+/893289 22:31
fungi since that ^ has merged, i undid the corresponding edit to /etc/ansible/hosts/group_vars/nodepool.yaml 22:37
fungi i didn't revert it since it was never committed to git on bridge anyway 22:37
*** benj_0 is now known as benj_ 23:20
