Tuesday, 2024-04-23

jrosser_ hmm glance really is not happy with these capi jobs https://zuul.opendev.org/t/openstack/build/61ed8fbc30db4a64a4a70b3b26266e27/log/job-output.txt#27356 07:26
jrosser_ write error https://zuul.opendev.org/t/openstack/build/61ed8fbc30db4a64a4a70b3b26266e27/log/logs/openstack/aio1-glance-container-7f359023/glance-api.service.journal-23-38-40.log.txt#2052 07:28
noonedeadpunk but they are for previous one? 07:30
noonedeadpunk that's quite weird 07:30
jrosser_ it's doing it a lot 07:51
jrosser_ and is weird as all i did was make a patch to have several jobs running with one variable different between them 07:51
jrosser_ but yesterday merging the main patch to os_magnum just want through with no problems 07:51
jrosser_ *went 07:51
noonedeadpunk it really did 07:52
jrosser_ i think i still have an aio here so i can try one of those failing versions 07:52
noonedeadpunk but also https://zuul.opendev.org/t/openstack/build/4f26b36e71f945afa131013bf7792ca6 just passed as well 07:52
jrosser_ but somehow feels like CI node issue 07:52
noonedeadpunk 2 times in a row? 07:53
jrosser_ one obvious thing is that my job launches a bunch of these in parallel 07:53
jrosser_ but yesterday's patch only runs one 07:53
jrosser_ this disk looks huge so it doesn't seem out of space https://zuul.opendev.org/t/openstack/build/61ed8fbc30db4a64a4a70b3b26266e27/log/logs/openstack/instance-info/host_system_info_22-25-57.log.txt#6525 07:54
jrosser_ looks like 160G root disk on these 07:55
noonedeadpunk oh, btw, we were transferred https://opendev.org/openstack/ansible-role-frrouting 08:00
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Reflect frrouting role new place  https://review.opendev.org/c/openstack/openstack-ansible/+/916719 08:02
noonedeadpunk there were couple of open patches there 08:03
noonedeadpunk https://review.opendev.org/c/openstack/ansible-role-frrouting/+/910373 08:03
noonedeadpunk and, it has molecule (was even operational) 08:03
noonedeadpunk so might be good place to start experiments on how to make "consistent" functional jobs for roles 08:04
jrosser_ why does that need 2 nodes 08:24
jrosser_ with the docker driver for molecule doesn't it just make two containers? 08:24
noonedeadpunk it does 08:25
noonedeadpunk it was historically without molecule 08:25
noonedeadpunk just native 2 nodes job 08:26
noonedeadpunk I guess this can be dropped now actually 08:26
jrosser_ so interestingly glance upload of one of the other k8s images just worked first time in my AIO 08:28
jrosser_ interesting discussion on the ML about branchless SDK 09:07
jrosser_ i wonder how much better/worse the 0.99/1.x transition of SDK vs. the ansible collection would have been in that case 09:08
noonedeadpunk huh? I don't see this one at all 10:00
noonedeadpunk ah, branchless detection thread 10:00
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Add ssl configuration to DB connection string  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/916754 10:09
noonedeadpunk doh, seems we have circular dependency for skyline between https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/916754/ and https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/912370 11:36
jrosser_ hmm i guess we should just squash those together 12:57
noonedeadpunk yeah 13:07
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Add EL distro support and ssl configuration for DB connection  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/912370 13:08
jrosser_ interesting https://opendev.org/vexxhost/openstack-operator/src/branch/master/openstack_operator/templates/operator/uwsgidefaultconfig.yml.j2#L30 13:30
jrosser_ relatedly https://stackoverflow.com/questions/36156887/uwsgi-raises-oserror-write-error-during-large-request 13:31
opendevreview Jonathan Rosser proposed openstack/ansible-role-uwsgi master: Work around OSError during large transfers  https://review.opendev.org/c/openstack/ansible-role-uwsgi/+/916790 13:35
opendevreview Jonathan Rosser proposed openstack/openstack-ansible-ops master: Test all supported versions of k8s workload cluster with magnum-cluster-api  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916649 13:38
noonedeadpunk huh, interesting indeed 14:00
jrosser_ the SO article is really quite like what i am seeing for glance 14:06
jrosser_ but suuuuuper unhelpfully there is no further information past OSError 14:07
noonedeadpunk well, we still avoid uwsgi for glance with ceph. though now we're doing a setup with glance and uwsgi, but it's using swift as backend... 14:08
jrosser_ excellent docs https://uwsgi-docs.readthedocs.io/en/latest/Options.html#ignore-write-errors 14:09
noonedeadpunk hard to argue about annoying part 14:10
noonedeadpunk not sure if uploaded images are consistent.... 14:10
noonedeadpunk maybe they are... 14:10
noonedeadpunk this all really is very confusing 14:11
jrosser_ right 14:11
jrosser_ feels like just randomly stabbing at things with no understanding :( 14:11
jrosser_ i can't reproduce it either 14:12
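
A minimal sketch of the workaround being discussed above, in Python for illustration only. It renders a `[uwsgi]` ini section with the three write-error options that the uWSGI options page documents (`ignore-sigpipe`, `ignore-write-errors`, `disable-write-exception`). Whether the ansible-role-uwsgi patch sets exactly these options is an assumption, and `render_uwsgi_ini` is a hypothetical helper, not part of any role. These options suppress the error rather than fix its cause, which fits the doubt raised above about whether uploaded images are consistent.

```python
import configparser
import io

# Options documented by uWSGI for tolerating failed writes to the client.
# Assumption: this is the set a workaround would enable; it hides the
# OSError rather than fixing whatever caused the write to fail.
WRITE_ERROR_WORKAROUND = {
    "ignore-sigpipe": "true",
    "ignore-write-errors": "true",
    "disable-write-exception": "true",
}


def render_uwsgi_ini(overrides: dict) -> str:
    """Render a [uwsgi] ini section from a dict of option overrides."""
    parser = configparser.ConfigParser()
    parser["uwsgi"] = overrides
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


ini_text = render_uwsgi_ini(WRITE_ERROR_WORKAROUND)
```

Printing `ini_text` gives a fragment that could be compared against a service's existing uWSGI configuration before deciding whether to apply it.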
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Update global pins for 2024.1  https://review.opendev.org/c/openstack/openstack-ansible/+/916792 14:21
noonedeadpunk no idea if that will pass ^ 14:22
jrosser_ i did look quickly at updating openstack_hosts for caracal UCA 14:23
jrosser_ but that seemed to rely on openstack_distrib_code_name 14:24
jrosser_ was not sure if we wanted to update that yet? 14:24
noonedeadpunk #startmeeting openstack_ansible_meeting 15:01
opendevmeet Meeting started Tue Apr 23 15:01:03 2024 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. 15:01
opendevmeet Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 15:01
opendevmeet The meeting name has been set to 'openstack_ansible_meeting' 15:01
noonedeadpunk #topic rollcall 15:01
noonedeadpunk o/ 15:01
jrosser_ o/ hello 15:01
noonedeadpunk so, I failed quite a bit to send out PTG results :( 15:02
noonedeadpunk though there were not many things 15:02
noonedeadpunk mainly moved things from the past one to the new cycle 15:03
noonedeadpunk #topic office hours 15:06
noonedeadpunk so, there was a ML thread about cluster-api deployment with osa: https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/27R2UOTCIHNAAGFFCG36KWHDPFLD3TZ4/ 15:08
noonedeadpunk I was mainly guessing there as never checked it/played enough myself 15:08
noonedeadpunk no idea if folk gave up or got it working though 15:08
NeilHanlon hiya folks 15:09
noonedeadpunk o/ 15:11
noonedeadpunk I'm actually planning to keep pushing patches this week for quorum queues improvements 15:12
noonedeadpunk as this needs to be done across all services - would be good to know a health state overall 15:12
noonedeadpunk until it's not _too_ late 15:12
jrosser_ i think that the cluster api guy from the ML showed up in the cluster-api slack channel and i helped out there 15:12
noonedeadpunk ah 15:12
noonedeadpunk ok, good to know there's some slack channel :D 15:13
jrosser_ that was the push i needed to get it fixed up last weekend 15:13
noonedeadpunk Also, if anybody missed, I've opened Answers section in launchpad, where almost instantly landed question around OVN docs: https://answers.launchpad.net/openstack-ansible/+question/709484 15:14
jrosser_ this is kind of a big problem 15:18
noonedeadpunk that I've opened it?:) 15:22
noonedeadpunk or that we're bad in documentation? 15:22
noonedeadpunk also as an update - frrouting role has been moved under our governance. it's needed for ovn-bgp-agent implementation 15:26
jrosser_ i think the trouble is that OVN is so hard 15:26
NeilHanlon nice :D 15:26
NeilHanlon I really need to look into OVN 15:26
noonedeadpunk Actually, I don't find it _that_ hard today 15:26
NeilHanlon sorry i'm distracted today... covering for a few people 15:27
noonedeadpunk like it's mainly just very different 15:27
jrosser_ and simultaneously the docs do not show a newcomer what all the moving parts are 15:27
jrosser_ and how to translate what you want into working config 15:27
NeilHanlon i think the problem is that OVN is itself a huge beast, and so a newcomer to OpenStack **and** OVN will be like "WTF!" 15:27
jrosser_ right 15:27
noonedeadpunk yeah, and I think we lack examples. Or better say - they're hidden in os_neutron docs 15:27
NeilHanlon also OpenStack Networking in general is pretty counter to what a "Network Engineer" would expect 15:28
noonedeadpunk yeah, I think you might be onto smth NeilHanlon 15:28
NeilHanlon like I still have to reprogram my brain to think about what OpenStack does with provider/external nets :) 15:28
mgariepy it was always left to the deployer for the network configuration 15:30
mgariepy stuff can work quite differently with ovn tho. 15:30
jrosser_ i think it is pretty difficult on top to understand what to put in openstack_user_config 15:38
noonedeadpunk yeah 15:38
jrosser_ there are too many weirdly named fields 15:38
mgariepy there are many ways to do stuff also. 15:39
jrosser_ so primarily this is a documentation issue then 15:39
noonedeadpunk yeah, maybe we should somehow promote neutron_provider_networks which is way more obvious.... 15:40
noonedeadpunk but then when I tried to - it was even more confusing on what to add where 15:41
NeilHanlon surely the AI can do it for us 15:46
* NeilHanlon ducks 15:46
jrosser_ this is also not a case of making some quick fix to the docs for some error 15:47
jrosser_ it needs time/concentration to make a coherent set of stuff for how things actually are with OVN 15:47
NeilHanlon yeah 15:47
* NeilHanlon will ask in his circles if anyone might be interested in some OSA+OVN documentation 15:48
NeilHanlon work* 15:48
noonedeadpunk well, I partially did some 15:50
noonedeadpunk but apparently we have more places to fix 15:50
noonedeadpunk like deployment guide at least 15:50
jrosser_ maybe we have to start from what got put in the launchpad answers section 15:51
noonedeadpunk yeah 15:53
noonedeadpunk that sounds like a good start 15:54
noonedeadpunk #endmeeting 15:58
opendevmeet Meeting ended Tue Apr 23 15:58:00 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) 15:58
opendevmeet Minutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-23-15.01.html 15:58
opendevmeet Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-23-15.01.txt 15:58
opendevmeet Log:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-23-15.01.log.html 15:58
opendevreview Merged openstack/ansible-role-uwsgi master: Add Debian 12 distro setup variable  https://review.opendev.org/c/openstack/ansible-role-uwsgi/+/915080 16:02
jrosser_ haproxy thinks glance API is down during image upload https://zuul.opendev.org/t/openstack/build/f7e8ae3093e84d438cebd22eda845ce0/log/logs/host/haproxy.service.journal-15-42-20.log.txt#6873-6903 16:02
jrosser_ does uploading an image prevent it responding to a healthcheck i wonder 16:03
noonedeadpunk ah 16:07
noonedeadpunk yes, I think we have too less workers for glance 16:07
noonedeadpunk *too few 16:07
noonedeadpunk https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables.aio.yml.j2#L98-L101 16:08
noonedeadpunk so really no available workers to respond when upload is in progress 16:08
noonedeadpunk I've actually seen same in AIO when tried to upload through horizon 16:08
jrosser_ we probably get away with that for tempest/cirros as the image is tiny 16:09
noonedeadpunk yeah, likely 16:09
jrosser_ and perhaps my storage here is quick so i have not seen it 16:10
noonedeadpunk not sure how it passes with coreos though or amphora... 16:10
jrosser_ which of those vars actually makes a difference 16:10
noonedeadpunk I think glance_wsgi_threads ? 16:11
noonedeadpunk not sure... 16:11
noonedeadpunk yeah, wsgi_threads is for uwsgi 16:11
noonedeadpunk and glance_api_threads for non-uwsgi 16:12
jrosser_ ah ok 16:12
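
The worker starvation described above can be sketched with a toy model, assuming the API service is represented as a fixed-size thread pool. This is plain Python standard library and does not use glance, uWSGI or haproxy; `healthcheck_answers` and its parameters are hypothetical names chosen for illustration. One long "upload" request holds a worker, and a "health check" submitted meanwhile is only answered within its timeout if a second worker exists.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def healthcheck_answers(workers: int, check_timeout: float = 0.2) -> bool:
    """Return True if a health check is answered while an upload is running."""
    upload_started = threading.Event()
    upload_done = threading.Event()

    def upload() -> str:
        # Simulates a large image upload that occupies its worker.
        upload_started.set()
        upload_done.wait(5)
        return "uploaded"

    def healthcheck() -> str:
        return "OK"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pool.submit(upload)
        upload_started.wait(1)
        check = pool.submit(healthcheck)
        try:
            # The load balancer only waits a short time for an answer.
            return check.result(timeout=check_timeout) == "OK"
        except FutureTimeout:
            return False
        finally:
            # Release the simulated upload so the pool can shut down.
            upload_done.set()


single_worker_ok = healthcheck_answers(workers=1)
two_workers_ok = healthcheck_answers(workers=2)
```

With one worker the check times out, which matches a backend being marked down mid-upload; with two it is answered. A tiny image would finish before the check times out, consistent with the guess above about cirros.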
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Do not define a random password for each run  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/912332 16:12
opendevreview James Denton proposed openstack/openstack-ansible-os_skyline master: Support large uploads via Skyline  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/914149 16:13
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Install skyline-console through yarn  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/914405 16:13
opendevreview Jonathan Rosser proposed openstack/openstack-ansible master: Increase number of threads to 2 for glance in AIO  https://review.opendev.org/c/openstack/openstack-ansible/+/916810 16:14
opendevreview Jonathan Rosser proposed openstack/openstack-ansible-ops master: Test all supported versions of k8s workload cluster with magnum-cluster-api  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916649 16:15
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add service policies defenition  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916812 16:31
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add service policies defenition  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916812 16:32
jrosser_ noonedeadpunk: did i show you this? https://bugs.launchpad.net/oslo.messaging/+bug/2031512 16:34
noonedeadpunk jrosser_: I think I saw it, except Andrew's latest comments 16:55
noonedeadpunk so last update I saw was from March I guess 16:56
jrosser_ yeah so we did more work and also got an answer from the rabbitmq people 16:56
jrosser_ that the un-named reply queues basically will never be OK 16:56
jrosser_ and changes to oslo.messaging now make this actually visible where before it was not 16:57
opendevreview Stuart Grace proposed openstack/openstack-ansible-ops master: Clarifications to mcapi_vexxhost README  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916817 16:58
noonedeadpunk So I guess question is - what changes you mean?:) 16:58
jrosser_ so we plan to switch to HA reply queues (unless there is something terrible we overlooked) - andrew submitted some patches 16:59
noonedeadpunk As I think latest change for 2024.1 was to make transient queues replicated 16:59
jrosser_ a bug fix i think https://github.com/openstack/oslo.messaging/commit/b4b49248bcfcb169f96ab2d47b5d207b1354ffa8 16:59
noonedeadpunk aha 16:59
jrosser_ before that, there was enough time for the queue to be deleted in rabbitmq 16:59
jrosser_ but now it is quicker to try to reconnect / re-create the queue and it's full-on race condition 17:00
noonedeadpunk https://github.com/openstack/oslo.messaging/commit/989dbb8aad8be68a9c63e2e6a4d445cc445c051c 17:00
noonedeadpunk and basically I wanted to enable that as default in 2024.1 17:00
jrosser_ right yes - so reason i bring this up is that we have on the backlog here a task to make a helper script to do the quorum queue migration 17:01
noonedeadpunk Well. I think migration is kinda "handled" by changing name of vhost 17:01
noonedeadpunk (currently) 17:01
noonedeadpunk And I was unable to find anything more efficient 17:02
jrosser_ kind of, yes, but i think that the downtime might be significant 17:02
jrosser_ ^ if you just do it as part of a regular upgrade 17:02
noonedeadpunk you do that service by service anyway. But yes, compute/neutron might struggle 17:02
noonedeadpunk unless you're running ovn 17:02
jrosser_ indeed - so we were going to look at the upgrade stuff a bit for this 17:03
noonedeadpunk but you still pretty much need to empty out all queues/release all connections to switch 17:03
noonedeadpunk as once "old" client connects to host - it creates classic queues and then you can't convert 17:03
noonedeadpunk *to vhost 17:03
noonedeadpunk so you pretty much need to stop every client connecting to vhost before doing anything 17:04
noonedeadpunk which is pretty much like just swapping the vhost name, and then downtime on operations until the playbooks are finished... 17:05
noonedeadpunk but yeah 17:05
noonedeadpunk would be great to improve that 17:05
noonedeadpunk potentially, we might have some kind of tag, but we'd need to both re-configure service and oslo part 17:05
noonedeadpunk just to skip them during upgrade and execute right afterwards, so config change/restart happened really fast 17:06
jrosser_ yeah so i think this is what we will look at 17:08
jrosser_ making sure the playbooks/tags are all set up properly to do some minimal change just for the message queues 17:08
noonedeadpunk and eventually - <service>-config tag went completely out of control.... 17:08
noonedeadpunk it's doing soooooooo many things today 17:09
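
The reasoning above for swapping the vhost name rather than converting queues in place can be sketched with a toy model. This is not the RabbitMQ or oslo.messaging API: `ToyBroker`, `PreconditionFailed` and `vhost_for` are hypothetical, and the "/glance" versus "glance" naming is an assumption for illustration. The only behaviour modelled is that a queue's type is fixed when it is first declared, so redeclaring it with a different type is refused.

```python
class PreconditionFailed(Exception):
    """Stand-in for a broker refusing a mismatched queue redeclaration."""


class ToyBroker:
    """Toy broker: a queue's type is fixed by whoever declares it first."""

    def __init__(self) -> None:
        self.vhosts: dict = {}  # vhost name -> {queue name -> queue type}

    def declare(self, vhost: str, queue: str, queue_type: str) -> str:
        queues = self.vhosts.setdefault(vhost, {})
        existing = queues.setdefault(queue, queue_type)
        if existing != queue_type:
            raise PreconditionFailed(
                f"{queue} in {vhost} already exists as {existing}"
            )
        return existing


def vhost_for(service: str, quorum_queues: bool) -> str:
    # Hypothetical naming scheme: a different vhost name per queue type.
    return service if quorum_queues else f"/{service}"


broker = ToyBroker()

# An "old" client connects first and creates a classic queue.
broker.declare(vhost_for("glance", False), "reply_q", "classic")

# A reconfigured client asking for quorum in the same vhost is refused.
try:
    broker.declare(vhost_for("glance", False), "reply_q", "quorum")
    in_place_ok = True
except PreconditionFailed:
    in_place_ok = False

# Pointing the service at a fresh vhost avoids the conflict.
fresh_type = broker.declare(vhost_for("glance", True), "reply_q", "quorum")
```

The cost noted above still applies in this model: clients on the old vhost and clients on the new one do not see each other's queues, so operations are disrupted until every service has been reconfigured.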
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add variable to globally control notifications enablement and disable RPC  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916820 17:13
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add variable to globally control notifications enablement and disable RPC  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916820 17:20
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Implement variables to address oslo.messaging improvements  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916821 17:20
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Implement variables to address oslo.messaging improvements  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916821 17:20
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_heat master: Add service policies defenition  https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/916826 17:50
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_heat master: Add variable to globally control notifications enablement  https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/916827 17:53
opendevreview Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_heat master: Implement variables to address oslo.messaging improvements  https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/916828 17:56
opendevreview Christian Mattsson proposed openstack/openstack-ansible-os_neutron master: Add debian package libstrongswan-standard-plugins  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/916832 18:18
opendevreview Stuart Grace proposed openstack/openstack-ansible-ops master: Clarifications to mcapi_vexxhost README  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916817 18:32
opendevreview Jonathan Rosser proposed openstack/openstack-ansible-ops master: Clarifications to mcapi_vexxhost README  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916817 18:34
noonedeadpunk well, clusterapi doc seems to be rendered "well" enough as well: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_afd/916653/3/check/openstack-tox-docs/afd849d/docs/mcapi.html 18:36
jrosser_ yeah that's good 18:41
jrosser_ the images need some "retain original aspect ratio" thing, that was also a bit wonky on the elk one 18:42
jrosser_ and i can also look at including the actual config files out of the repo instead of having duplication 18:42
noonedeadpunk I had to drop ratio, as sphinx can't make it for SVG 18:44
noonedeadpunk it can for PNG though 18:44
jrosser_ looks like the glance threads thing was the cause for upload failures 18:44
jrosser_ the capi one is png i think 18:46
noonedeadpunk yeah 18:46
noonedeadpunk but I thought I left scale there if it was there... 18:50
noonedeadpunk ok, it was `:scale: 100 %` :D 18:50
