Monday, 2023-04-03

noonedeadpunkmornings07:48
jrossergood morning07:49
noonedeadpunkhuh, https://review.opendev.org/c/openstack/openstack-ansible/+/879069 is weird indeed07:50
noonedeadpunkSounds like https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/878929/2/vars/redhat-9.yml is off somehow07:51
noonedeadpunkyup, no exclude here https://zuul.opendev.org/t/openstack/build/625d1c3798414708b0f14fdec1054688/log/logs/etc/openstack/aio1_neutron_server_container-1b9896cf/yum.repos.d/rdo-deps.repo.txt07:52
noonedeadpunkbut it's weird - distribution looks nice https://zuul.opendev.org/t/openstack/build/625d1c3798414708b0f14fdec1054688/log/logs/etc/host/openstack_deploy/ansible_facts/aio1_neutron_server_container-1b9896cf.txt#3907:53
jrosserthat looks odd `changed: [aio1] => (item={'name': 'rdo-deps', 'file': 'rdo-deps', 'description': 'rdo-deps', 'baseurl': 'https://trunk.rdoproject.org/centos9-zed/deps/latest/', 'gpgcheck': False, 'module_hotfixes': True, 'exclude': False})`07:57
jrosserexclude: False07:57
jrosseroh i know what it is07:58
jrosserthere always needs to be a set of ( ): "{{ (foo == bar) | ternary(this, that) }}"07:59
jrosserotherwise it actually tests if foo is equal to bar | ternary(this, that)07:59
noonedeadpunkdamn it08:00
jrosser| is super high precedence operator08:00
noonedeadpunkYeah, I tend to always set round brackets, no idea how I managed to forget this time08:00
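A minimal sketch of the precedence issue discussed above, written as a standalone Ansible playbook. The variable names `foo` and `bar`, their values, and the file name `precedence.yml` are placeholders for illustration; they are not taken from the openstack_hosts role.

```yaml
# Hypothetical demo playbook; run with: ansible-playbook precedence.yml
- name: Show that a Jinja2 filter binds tighter than ==
  hosts: localhost
  gather_facts: false
  vars:
    foo: a
    bar: a
  tasks:
    - name: Compare the expression with and without parentheses
      ansible.builtin.debug:
        msg:
          # Parsed as foo == (bar | ternary('this', 'that')),
          # so the result is a boolean, here false because 'a' != 'this'.
          without_parentheses: "{{ foo == bar | ternary('this', 'that') }}"
          # The comparison is evaluated first and then passed to the filter,
          # so the result is 'this'.
          with_parentheses: "{{ (foo == bar) | ternary('this', 'that') }}"
```

This is consistent with the `'exclude': False` seen in the task output: the unparenthesized form returns the result of the comparison rather than either ternary branch.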
noonedeadpunkbtw I've spotted same issue here https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/878771/4/tasks/haproxy_service_config.yml damiandabrowski08:01
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Fix package exclude condition for rdo-deps  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/87927208:05
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible-haproxy_server master: Fix haproxy_service_configs format conversion  https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/87877110:52
ElnazHi12:20
Elnazhttps://meetings.opendev.org/irclogs/%23openstack-ansible/%23openstack-ansible.2023-03-06.log.html12:21
Elnaz> i have a pretty large deployment of ELK using that repo12:21
Elnazjrosser: ^ Can you explain what architecture you are using ELK with?12:21
opendevreviewMerged openstack/openstack-ansible-openstack_hosts master: Fix package exclude condition for rdo-deps  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/87927212:25
admin1if nova is local disk, but cinder/glance was ceph .. i recall we needed to use some direct-path setting to make snapshot work else it will time out ( and also same for using images via horizon ) .. do i recall correctly  ? 12:29
admin1 i am trying to remember the variable 12:29
hamidlotfiHi there,12:40
hamidlotfiFollowing up on my last post, I wanted to scale my OSA environment with your guide published at `https://docs.openstack.org/openstack-ansible/latest/admin/scale-environment.html`12:40
hamidlotfiI added infra04 to my env with the series of commands you gave me in my last posts:12:40
hamidlotfi`1# openstack-ansible playbooks/setup-hosts.yml --limit localhost,infra04,infra04-host_containers`12:40
hamidlotfi`2# openstack-ansible playbooks/openstack-hosts-setup.yml`12:40
hamidlotfi`3# openstack-ansible playbooks/setup-infrastructure.yml -e galera_force_bootstrap=true`12:40
hamidlotfi`4# openstack-ansible playbooks/os-keystone-install.yml`12:40
hamidlotfi`5# openstack-ansible playbooks/setup-openstack.yml --limit '!keystone_all',localhost,infra04,infra04-host_containers`12:40
hamidlotfiIn step 5, the cinder section shows this error12:40
hamidlotfi`FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_local'\n\nThe error appears to be in '/opt/openstack-ansible/playbooks/os-cinder-install.yml': line 69, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n    # venv tag for all hosts in 12:40
hamidlotfithe 'cinder_all' host group.\n    - name: Gather software version list\n      ^ here\n"}`12:40
hamidlotfisorry for long message 😔12:41
hamidlotfi@jrosser 12:47
jrosserhello12:47
jrosser@hamidlotfi can you look at what hosts you have in the cinder_all group?12:48
hamidlotfiLet me check it12:48
hamidlotfi[cinder_all:children]12:49
hamidlotficinder_api12:49
hamidlotficinder_backup12:49
hamidlotficinder_scheduler12:49
hamidlotficinder_volume12:49
hamidlotfi[cinder_api]12:49
hamidlotfi[cinder_backup]12:49
hamidlotfi[cinder_scheduler]12:49
hamidlotfi[cinder_volume]12:49
jrossernoonedeadpunk: ^ this is the "add infra node" equivalent of the "add compute node" facts gathering trouble12:50
jrosserhamidlotfi: hmm ok12:50
jrosserhamidlotfi: the easy answer is to add "cinder_all" to the --limit on step 512:51
jrosserhamidlotfi: i don't think it is possible to write a set of instructions here that works for every possible deployment12:51
hamidlotfiSo the command would look like this:12:52
hamidlotfi`openstack-ansible playbooks/setup-openstack.yml --limit '!keystone_all',localhost,infra04,infra04-host_containers,cinder_all`12:52
hamidlotfi@jrosser Is it correct ?12:52
jrosserhamidlotfi: honestly i do not know, because i don't know about your deployment12:53
jrosseryou need to make sure that the play runs against all the hosts in the cinder_all group12:54
hamidlotfiI check it right now12:56
noonedeadpunkI'm pretty sure that ansible-core 2.13 just fixed local facts gathering, so that limit doesn't affect them anymore... But for infra case it's waaay more tricky to add some var to skip these steps13:07
jrosserthese instructions with --limit look unhelpful13:10
noonedeadpunkso we likely should evaluate other ways of deciding if migration is needed or if all hosts are executed13:10
jrosserand still there is the case of wanting to handle having a node down too13:11
noonedeadpunkfor me if node is down - it's not a reason not to execute migration. It will likely cause troubles with this specific node once it's booted, but still migrations should pass.13:15
noonedeadpunkwe probably should have discussed that previous week actually13:17
noonedeadpunkmy bad it wasn't in agenda, as it's worth to be there13:17
noonedeadpunkI'm also trying to think of a good reason to restart all services like we do...13:22
noonedeadpunkAs I'm not sure on why we're doing that at all. Like to get new rpc version we need to update code. If we're updating it - services are restarted regardless. 13:23
noonedeadpunkok, we're running in serial with limits... But again, each service will use default rpc version that's in code, so changing venv or updating packages should be just enough to trigger all required restarts13:24
NeilHanlonbtw it does look like rocky 9 will end up with Python 3.11, so we should be "okay" in that respect13:25
noonedeadpunkNeilHanlon: will it end up the same way like centos usually does - without any extra pre-built libs?13:25
NeilHanlonunsure :( but i will find out more.. we just noticed some new python packages in the RHEL 8.8 beta13:26
noonedeadpunklike libselinux-python or python-libvirt?13:26
noonedeadpunkas without that it's close to be useless13:26
noonedeadpunkSo another usecase we have - db migrations. I kind of wonder if there might be a way to check if they are needed using nova-manage itself13:30
hamidlotfi@jrosser It seems to pass the error.13:30
noonedeadpunkor we can do like we do for all other services - running against last host in the group13:31
noonedeadpunkSo technically, I want to just drop all that https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/os-nova-install.yml#L31-L163 if it's possible ofc. But I don't have any multi-node sandbox to play with atm...13:32
noonedeadpunkAlso checking that https://docs.openstack.org/nova/latest/cli/nova-manage.html#db-online-data-migrations - it says "after upgrading database schema and nova services on all controller nodes"13:34
noonedeadpunkSo this is weird then https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/master/tasks/main.yml#L292-L30213:34
noonedeadpunkWe should really refactor all that....13:36
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates  https://review.opendev.org/c/openstack/openstack-ansible/+/87935515:01
damiandabrowskiwhat is the current status of gates? are they still broken?16:05
damiandabrowskiopenstack-ansible-deploy-infra_lxc-centos-9-stream failed twice for https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/87877116:06
damiandabrowskifatal: [aio1_repo_container-40ec3906]: FAILED! => {"changed": false, "msg": "Could not find the requested service systemd-tmpfiles-setup-dev: host"}16:06
noonedeadpunkdamiandabrowski: https://review.opendev.org/c/openstack/openstack-ansible/+/879069/2 is needed16:33
noonedeadpunkyou can use depends-on16:33
damiandabrowskiahhh, thanks16:34
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates  https://review.opendev.org/c/openstack/openstack-ansible/+/87935516:39
admin1is there a limit or timeout somehow when using local storage and glance (ceph ) based snapshots 17:07
admin1i get a  Broken pipe 17:07
admin1i think it was some option in haproxy regarding http vs tcp 17:07
admin1i recall this info in bits and pieces  from conversation here 17:07
admin1the best i recall is glance haproxy backend to be in tcp vs http 17:13
damiandabrowskiglance can run behind uwsgi or standalone17:24
damiandabrowskiif you want to avoid issues with ceph, i'd recommend disabling uwsgi17:24
damiandabrowskiwait, I should have a bug report somewhere17:24
admin1is there a way to use variables to do that ?  disable uwsgi and change http -> tcp ? 17:24
admin1i am downloading  the 2020 irc logs we had to check it out 17:26
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates  https://review.opendev.org/c/openstack/openstack-ansible/+/87935517:26
damiandabrowskiwe have disabled uwsgi for glance with glance_use_uwsgi: False17:28
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Tune settings in galera server for reduced ram in all-in-one build  https://review.opendev.org/c/openstack/openstack-ansible/+/87727817:29
damiandabrowskiseems like switching to http is also an option but i didn't test it: https://bugs.launchpad.net/glance/+bug/191648217:30
damiandabrowskiperhaps you can switch to tcp by overriding haproxy_balance_type with haproxy_glance_api_service_overrides17:30
damiandabrowskiswitching to tcp is also an option*17:32
noonedeadpunkadmin1: you need to either disable uwsgi or use tcp. `glance_use_uwsgi` is the variable to control that17:34
noonedeadpunkBut we have another bug that service likely won't be restarted when you change this variable.17:34
admin1haproxy or glance ? 17:35
noonedeadpunkswitching to tcp is way less trivial17:35
noonedeadpunkglance-api17:35
admin1yeah  .. is that setting alone enough to fix it ? 17:36
noonedeadpunkyup17:36
noonedeadpunkprobably we should disable it by default to be frank... or at least when we see that ceph is going to be used17:38
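For reference, the two workarounds mentioned above could be expressed as deployment overrides roughly like this. It is a sketch only: `/etc/openstack_deploy/user_variables.yml` is assumed to be where overrides live in this deployment, and the structure of `haproxy_glance_api_service_overrides` was only suggested tentatively in the discussion and is not verified here.

```yaml
# /etc/openstack_deploy/user_variables.yml (assumed location)

# Option 1: run glance-api standalone instead of behind uWSGI.
glance_use_uwsgi: False

# Option 2 (structure is an assumption, not confirmed in the discussion):
# switch the glance haproxy backend from http to tcp.
# haproxy_glance_api_service_overrides:
#   haproxy_balance_type: tcp
```

As noted above, glance-api may not be restarted automatically after `glance_use_uwsgi` is changed, so the service may need a manual restart.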
jrossernoonedeadpunk: we have both ceph read and write caches working here - is that something for ceph_client role?17:42
jrosserit’s really all done on the computes17:43
noonedeadpunkWhat do you have in mind for the role? As I guess you can place custom ceph.conf there or do overrides?17:44
jrosserthere’s some directories to make with the right permission, and a drop in is needed for the read cache service17:44
jrosserand the package for the read cache installing actually17:45
jrosserso pretty small really17:45
noonedeadpunkwell, we can introduce couple of new vars then I assume?17:47
jrosseryes, two bools I guess to enable the read / write caches, and some defaults for dir paths… something like that17:48
noonedeadpunk++17:48
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Disable uWSGI if ceph is used as a store  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/87937017:48
jrosserwe got 3x throughput with the read cache17:49
jrosserand similar with write, even with nvme osds17:49
jrosserwell write was more ops/sec actually, depends how you read FIO output17:50
noonedeadpunkWell, I assume for write it's a bit different though, as it just does them in an async way17:50
noonedeadpunkAs it acks writes without latency (only with the latency of the local drive)17:50
jrosserright, but interestingly it seems also to be able to read from it17:50
jrosserfor recently written stuff17:50
noonedeadpunkHuh, interesting... I'm kind of more sceptical about writes, as I'm quite afraid of what will be consequences of local drive failure17:51
noonedeadpunkAs I'd assume they will wear out really fast17:51
jrosseroh indeed - it’s all totally dependent on the workload and the consequences of failure17:52
jrosseranyway - I will see if I can write some code as there’s basically no example anywhere of making this work17:53
noonedeadpunkI have no idea about failure, I assume it will result in broken filesystems? As it should not block PGs, as PGs have no idea about locally cached actions I guess...17:53
noonedeadpunkAnd well, compute failure is another interesting case - I guess you don't want to evacuate VMs with that anymore unless you really don't have any chance of bringing the compute back17:54
noonedeadpunkbut read cache is really appealing17:55
jrosseryeah, it’s only caching the parent you snapshot from though, not anything you write subsequently17:56
jrosserbut for CI it’s probably really worth it, or any use case with huge read only datasets17:56
noonedeadpunkyeah, totally worth having sample/docs on how to do that as it's very interesting and might be worth the risk - it's anyway not worse than just local drives but with comparable performance17:57
noonedeadpunkbtw it's interesting whether live-migration works with write cache on vms with intense iops...17:58
jrosseras far as compute failure goes the cache is integrated with the lock on the volume17:58
noonedeadpunkyeah, so you're having exclusive lock, right17:58
jrosseroh yes live migration :) didn’t try that as it’s only set up on one node right now….18:00
noonedeadpunkI'd assume race conditions are possible there... 18:01
noonedeadpunkBut I didn't test any of these for the matter of fact, so everything I'm saying is just speculation18:04
noonedeadpunkor fud :D18:04
admin1the unable to restart when doing glance_use_uwsgi: False   .. is that one time .. or every time ? 18:05
admin1because if its one time, i can manually stop/start the lxc container for example18:06
damiandabrowskii can recall a situation when switching between glance uwsgi<>standalone was leaving the old process running and I had to kill it manually18:15
damiandabrowskimaybe the same happened to you18:15
jrosserI guess it does “do the not uwsgi thing” rather than “undo the uwsgi thing”18:16
admin1changing to tcp seems to have worked . testing with a bigger image/snapshot 19:09
admin1thanks guys 19:09
opendevreviewMerged openstack/openstack-ansible master: Disable CentOS LXC jobs due to the bug in systemd packaging  https://review.opendev.org/c/openstack/openstack-ansible/+/87906919:22
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-ceph_client stable/zed: Add EPEL GPG key for RHEL 9  https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/87918620:02
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-ceph_client stable/zed: Add thrift to includepkgs from EPEL  https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/87918720:03
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/zed: Add openstack_hosts_file tag  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/87918820:03
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/yoga: Add openstack_hosts_file tag  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/87918920:04
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/xena: Add openstack_hosts_file tag  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/87939020:04
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Tune settings in galera server for reduced ram in all-in-one build  https://review.opendev.org/c/openstack/openstack-ansible/+/87727820:30
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible master: Add support for TLS backends  https://review.opendev.org/c/openstack/openstack-ansible/+/87908520:30
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible master: Add support for TLS backends  https://review.opendev.org/c/openstack/openstack-ansible/+/87908520:33
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible-os_keystone master: Define CA cert when needed  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/87937820:35
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible-os_keystone master: Rename keystone_ssl to keystone_backend_https  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/87937920:35
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible-os_placement master: Add TLS support to placement backends  https://review.opendev.org/c/openstack/openstack-ansible-os_placement/+/87938020:37
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible-os_nova master: Add TLS support to nova API backends  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/87481020:49
opendevreviewMerged openstack/openstack-ansible-plugins master: Revert "Ensure systemd-udev is installed for gluster"  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/87884221:43
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible-os_nova master: Add TLS support to nova API backends  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/87481023:44
