Thursday, 2021-12-16

opendevreviewMerged opendev/system-config master: zuul-*: use multiline formatter  https://review.opendev.org/c/opendev/system-config/+/82150800:06
corvusweb is back00:16
corvusoh sort of a shame that hadn't merged... could always roll again00:16
fungiyeah, ianw was asking if we should wait for that00:17
fungii think he was offering to do the restart once it landed00:17
ianwi can if we like, or just wait till next time00:20
ianwhopefully i won't have immediate need to debug any multi-line errors :)00:20
corvusoh sorry i thought that was a conversation with clarkb on a different subject00:22
corvusi understand how to demultiplex the multiple conversations now, but i did not at the time.00:22
fungiianw: i think it's more about test-driving what should be included in zuul 4.11.0, if we don't test drive zuul with that change in place, then we may not want to include it in that release (or maybe just roll the dice, it's tested anyway)00:23
ianwfungi: oh, well in this case the multiline logger has been in for a long time, we just have been overriding it with our log config00:24
fungiaha, nevermind then00:24
fungii agree it's not as urgent to add. now i see that was a system-config change anyway, not a zuul change00:24
fungiit'll be in 4.11.0 regardless00:25
fungiwe just don't know if we might discover bugs with it00:26
ianwclarkb: it looks like you got errno 111 (connection refused) instead of 113 (no route to host)?00:26
ianwthat ... seems right00:27
fungiwell, no route to host is what we expect from the firewall's default reject rule00:27
fungiunless we have a tcp-specific rule setting --reject-with tcp-reset00:28
ianwinteresting ... certainly proven it's a good thing to be running tests against00:29
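The distinction discussed above (errno 111, connection refused, versus errno 113, no route to host, as numbered on Linux) can be checked from a test with a small probe. This is only a sketch: the helper names `connect_errno` and `explain` are hypothetical and are not taken from the testinfra change under review.

```python
import errno
import socket


def connect_errno(address, port, timeout=3.0):
    """Attempt a TCP connect to an IPv4 address; return 0 or the errno."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns the error code instead of raising OSError.
        return sock.connect_ex((address, port))


def explain(code):
    """Map a connect errno to what it suggests about the far end."""
    if code == 0:
        return "connected: port is open"
    if code == errno.ECONNREFUSED:
        return "refused: host reachable, nothing listening or TCP reset"
    if code == errno.EHOSTUNREACH:
        return "no route to host: consistent with a firewall reject"
    return "other: " + errno.errorcode.get(code, str(code))
```

Comparing against the symbolic `errno` constants rather than the literal numbers keeps the check readable and avoids depending on platform-specific values.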
clarkbhrm ya I was getting 113 against prod00:31
clarkbfungi: ianw: any idea why that would've happened in testing?00:31
fungihitting an interface we're allowing traffic to but the service isn't listening on?00:32
clarkboh I see it ya00:33
clarkbits talking to 127.0.0.1 because I asked the zk host for its address00:33
clarkbthat's an interesting behavior00:33
ianw127.0.1.1 00:33
clarkbI need to get the zk address some other way to get the external addresses00:33
clarkbor maybe 127.0.1.1 and the external are both included. I'll strip out 127\..* to check that00:33
clarkbyup that is it00:34
fungimake sure you exclude ::1 as well00:34
clarkb++00:35
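Stripping loopback entries, including 127.0.1.1 and ::1, from a list of resolved addresses can be done without a regex by using the standard `ipaddress` module. A minimal sketch, assuming the addresses arrive as plain strings; the function name `external_addresses` is hypothetical:

```python
import ipaddress


def external_addresses(addresses):
    """Return the addresses that are not loopback (IPv4 or IPv6)."""
    return [
        addr for addr in addresses
        # is_loopback covers all of 127.0.0.0/8 as well as ::1.
        if not ipaddress.ip_address(addr).is_loopback
    ]
```

For example, `external_addresses(["127.0.1.1", "::1", "203.0.113.5"])` keeps only the documentation address `203.0.113.5`.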
ianwyou could also probably just connect to "zk0X.opendev.org" in the connect() directly?  that will look it up on bridge?00:35
opendevreviewClark Boylan proposed opendev/system-config master: Add firewall behavior assertions to testinfra testing  https://review.opendev.org/c/opendev/system-config/+/82178000:36
clarkbianw: ya, but my concern with that is I might get the actual prod host00:37
clarkbI figured looking up the name on the system configured with that hostname was going to be most reliable even if we didn't do /etc/hosts for all hosts in the multinode job00:37
opendevreviewJames E. Blair proposed zuul/zuul-jobs master: WIP: Switch docs theme to RTD  https://review.opendev.org/c/zuul/zuul-jobs/+/82191800:37
ianweither or; you'd also have AF_INET and AF_INET6 to contend with 00:38
ianwalthough i do think in other tests we've made the assumption that the host resolution is working00:38
ianwit is another good argument for making the testing hosts like zk99 as well00:38
clarkbya looking at it we aren't actually getting the AF_INET6 ip for some reason00:42
clarkbI wonder if we don't set up the ipv6 addr in /etc/hosts00:42
ianwperhaps not if we couldn't do it consistently?00:45
clarkbya that may be why00:46
fungiahh, no global v6 in some test providers, yep00:47
clarkbwe also do the overlay networks over ipv4 only because linux didn't support vxlan over ipv6 for the longest time00:47
opendevreviewwangxiyuan proposed openstack/project-config master: Add openEuler disto support for elements  https://review.opendev.org/c/openstack/project-config/+/82179401:37
fungi#status log Our jitsi-meet services including meetpad.opendev.org are shut down temporarily again, out of an abundance of caution awaiting newer images01:38
opendevstatusfungi: finished logging01:38
join_sublinecurious, is there a ballpark stats of the number of users using the meetpad per hour,day,week,etc .  i've used meetpad, meet.jit.si, and 8x8.vc .  have gotten the audio to crash on meet.jit.si, and experienced UI lag / freezing with 8x8.vc .  but overall, very useful webservice02:03
opendevreviewIan Wienand proposed openstack/diskimage-builder master: centos: work around 9-stream BLS issues  https://review.opendev.org/c/openstack/diskimage-builder/+/82177202:04
opendevreviewMerged openstack/project-config master: Add openEuler disto support for elements  https://review.opendev.org/c/openstack/project-config/+/82179402:21
*** rlandy|ruck|bbl is now known as rlandy|ruck02:30
*** rlandy|ruck is now known as rlandy|out02:38
clarkbjoin_subline: I think it tends to be more in bursts rather than sustained. Like when we have an event that uses it03:01
join_sublineah, ic.  well, i think i must be close to being in the top tier of etherpad.opendev.org powerusers.  i write so much into this (mostly just copying links), but thnxs for giving me a place to put all my online stuff.  that reminds me, i should run my download script for the /p/<pad>/export/txt , so i have a copy of everything i put up. 🤞03:10
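A download script of the kind mentioned above could look roughly like the following. It is a sketch only, built on the `/p/<pad>/export/txt` path quoted in the message; the function names and the choice of output filename are assumptions, not the user's actual script.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://etherpad.opendev.org"


def export_url(pad, base=BASE):
    """Build the plain-text export URL for a pad name."""
    return f"{base}/p/{quote(pad, safe='')}/export/txt"


def download(pad, dest_dir="."):
    """Fetch a pad's text export and save it as <pad>.txt (hypothetical layout)."""
    with urlopen(export_url(pad), timeout=30) as response:
        data = response.read()
    target = Path(dest_dir) / f"{pad}.txt"
    target.write_bytes(data)
    return target
```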
*** pojadhav|afk is now known as pojadhav04:30
*** ysandeep|out is now known as ysandeep06:22
opendevreviewMerged openstack/diskimage-builder master: Install only python3 pip in debian bullseye  https://review.opendev.org/c/openstack/diskimage-builder/+/82056306:36
opendevreviewVishal Manchanda proposed openstack/project-config master: Add "Review-Priority" label to horizon project  https://review.opendev.org/c/openstack/project-config/+/82193406:36
*** ysandeep is now known as ysandeep|lunch07:38
*** jpena|off is now known as jpena08:01
*** TheMaster is now known as Unit19308:10
*** ysandeep|lunch is now known as ysandeep08:17
opendevreviewMerged openstack/project-config master: Add openstack-venus irc channel in access an gerrit bot  https://review.opendev.org/c/openstack/project-config/+/82187508:59
wxy-xiyuan_ianw: the image build works now https://nb02.opendev.org/openEuler-20.03-LTS-SP2-0000000055.log Is there any place I can get the nodepool-launcher log? Thanks.09:36
opendevreviewMerged opendev/system-config master: Add openstack-venus channel in statusbot  https://review.opendev.org/c/opendev/system-config/+/82188209:54
fricklerwxy-xiyuan_: I don't think there is. I'll try to take a look myself09:54
fricklerwxy-xiyuan_: infra-root: there are errors uploading the openEuler image which I don't understand yet10:02
*** ysandeep is now known as ysandeep|afk10:32
wxy-xiyuan_frickler: Thanks for help10:36
fricklerwe seem to have a bug when the image name contains a "."10:44
fricklerFileNotFoundError: [Errno 2] No such file or directory: '/opt/nodepool_dib/openEuler-20.vhd'10:44
frickler-rw-r--r-- 1 nodepool nodepool 21615583744 Dec 16 05:27 /opt/nodepool_dib/openEuler-20.03-LTS-SP2-0000000055.vhd10:45
fricklerinfra-root: ^^ I can't debug further now, please have a look10:48
*** sshnaidm|afk is now known as sshnaidm10:54
*** ysandeep|afk is now known as ysandeep11:16
*** rlandy|out is now known as rlandy|ruck11:17
*** dviroel|out is now known as dviroel|rover11:26
*** pojadhav is now known as pojadhav|brb12:30
*** pojadhav|brb is now known as pojadhav12:52
fungiseems likely to be a greedy filename chop in the builder14:25
fungii'll try to look shortly14:25
funginot finding any obvious places yet where we make assumptions about filenames containing only one '.'14:34
fungibut there's a lot of places in nodepool.builder where the filename gets touched or passed through14:35
fungialso possible this is happening inside the openstack sdk14:43
fungiit may be easier to insert more debugging statements into nodepool in order to track down at what point the image name gets corrupted14:44
opendevreviewJames E. Blair proposed zuul/zuul-jobs master: Switch docs theme to RTD  https://review.opendev.org/c/zuul/zuul-jobs/+/82191814:48
fricklerfungi: maybe rather amend the image name to -20-03- for now?14:55
fricklerthe other thing that worries me is that nodepool seems to be looping trying to redo the upload every 10s or so14:56
clarkbI suspect the with_suffix in DibImageFile.to_path() is to blame15:01
clarkbyes just reproduced15:02
clarkbhttps://paste.opendev.org/show/bbXSvqbxVTJ7D6WBHo5N/15:02
clarkbI think that behavior is correct from pathlib. Everything after the last . in the /opt/nodepool_dib/openEuler-20.03-LTS-SP2-0000000055 name is treated as an extension and replaced when we set the .vhd extension on it15:04
clarkbProbably best to remove the .'s from the image name for now. Unless someone has a good idea for fixing that15:04
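The pathlib behavior described above is easy to reproduce in isolation. The sketch below uses the path from the log; the `append_suffix` helper is a hypothetical alternative shown for comparison and is not the fix adopted in nodepool.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/opt/nodepool_dib/openEuler-20.03-LTS-SP2-0000000055")

# pathlib treats everything after the last "." in the final component as
# the suffix, so with_suffix() replaces ".03-LTS-SP2-0000000055" with ".vhd".
truncated = base.with_suffix(".vhd")


def append_suffix(path, suffix):
    """Append a suffix to the file name without replacing any existing one."""
    return path.with_name(path.name + suffix)


appended = append_suffix(base, ".vhd")
```

Here `truncated` is `/opt/nodepool_dib/openEuler-20.vhd`, matching the FileNotFoundError earlier in the log, while `appended` keeps the full image name and ends in `-0000000055.vhd`.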
fungii'll have to look closer after meetings are over15:04
opendevreviewMerged zuul/zuul-jobs master: Switch docs theme to RTD  https://review.opendev.org/c/zuul/zuul-jobs/+/82191815:26
*** dviroel|rover is now known as dviroel|rover|lunch15:51
*** ysandeep is now known as ysandeep|out16:34
*** dviroel|rover|lunch is now known as dviroel|rover16:42
*** marios is now known as marios|out16:45
*** jpena is now known as jpena|off16:58
opendevreviewJeremy Stanley proposed opendev/system-config master: Add a domain aliases mechanism to lists.o.o  https://review.opendev.org/c/opendev/system-config/+/82191416:59
opendevreviewJeremy Stanley proposed opendev/system-config master: Create an OpenInfra Foundation staff ML  https://review.opendev.org/c/opendev/system-config/+/82191516:59
opendevreviewJeremy Stanley proposed opendev/system-config master: Forward messages for OpenInfra Foundation staff ML  https://review.opendev.org/c/opendev/system-config/+/82191616:59
clarkbthose changes lgtm though exim is always magic to me :)17:28
fungiyeah, turing-complete configuration languages always seem like a good idea at the beginning, but sometimes you end up spilling your program's guts all over the user as time goes on17:31
fungianyway, the reason that got complicated is i realized that while our mailrouting through the mta is virtual domain aware such that we can have the same list name on multiple domains, the rudimentary /etc/aliases format (in order to be backward-compatible with sendmail) isn't17:34
clarkbah17:34
fungianyway, assuming the testinfra test i added in 821916 passes, i'll stack a dnm break on top and set an autohold so i can manually test message delivery through the mailrouter added by 82191417:39
fungihttps://zuul.opendev.org/t/openstack/build/4f6f3417f10c4f65b01f0890cac0ab96/log/lists.openstack.org/aliases.domain.txt definitely has the content i want exim to act on, and the added test also confirms it. actually exercising that forward, on the other hand, is a little more tricky so i'll use an autohold17:51
clarkbsounds good17:51
fungiin theory we could have testinfra send something to the old address over localhost and then look for signs in the log that it tried to use the new address instead, but probably not worth it for now17:53
fungialso test nodes may not actually accept deliveries for the production hostnames in their current state17:53
opendevreviewJeremy Stanley proposed opendev/system-config master: DNM: break mailman testing for an autohold  https://review.opendev.org/c/opendev/system-config/+/82200718:03
kopecmartinClark[m]: hi, just out of curiosity, how many resources were needed to make openstack-health run? i guess storage was the biggest one18:10
clarkbkopecmartin: the biggest issue is people. We need people to update the configuration management of the system, fix it when it breaks (like it is currently broken), upgrade the services and so on18:11
fungikopecmartin: yeah, probably storage, we've got a 0.5tb mysql database (trove instance)18:11
fungifrom a non-wetware perspective18:11
clarkbthe hosting is an aspect of it, but like the ELK stuff all of the work there needs maintenance to bring it up to speed on current practices and supported operating systems18:11
clarkbSpecifically for openstack-health we need to update the deployments of the subunit2sql workers, the health api server, and status.o.o. The operating systems need to be upgraded and the configuration management needs to be converted from puppet to ansible + docker18:12
clarkbThen the health software also needs maintenance/updating but gmann would have more info on that18:13
kopecmartinyeah , people are the crucial point of this, i know .. all said on the call still stands, i was just wondering about the resources 18:14
clarkbkopecmartin: for the resources it's a couple of subunit2sql workers, I think they are 4vcpu 4GB memory. The api server and the web server, which is also reasonably small (8vcpu + 8GB memory?), and then a large database server18:15
clarkb*for the hardware resource18:15
kopecmartinthanks 18:16
gmannyeah, we do not have anyone currently to maintain the software (repo) itself. even 1 person is enough to keep it in working condition as not much new things to implement but just bug fixes if it is broken18:20
gmanndeveloper with JS skill18:20
opendevreviewJeremy Stanley proposed opendev/system-config master: Add a domain aliases mechanism to lists.o.o  https://review.opendev.org/c/opendev/system-config/+/82191419:22
opendevreviewJeremy Stanley proposed opendev/system-config master: Create an OpenInfra Foundation staff ML  https://review.opendev.org/c/opendev/system-config/+/82191519:22
opendevreviewJeremy Stanley proposed opendev/system-config master: Forward messages for OpenInfra Foundation staff ML  https://review.opendev.org/c/opendev/system-config/+/82191619:22
fungiclarkb: ^ through manual testing of the held node i found a problem with my domain aliases mailrouter, and also some options for simplifying it19:22
clarkbcool I'll rereview shortly19:25
fungii tested that with both staff@lists.openstack.org and staff@lists.openinfra.dev mailing lists created, messages to the former were delivered to the latter mailing list as intended, while messages to the latter also went directly to the list as usual. then i swapped the alias around and tested both of them again, just to make sure the behavior was reversed and we weren't simply delivering to19:25
fungiwhatever the first list it found happened to be19:26
fungitesting was done using the mail utility locally to inject messages through the loopback interface so that it wouldn't hit the egress reject we added19:26
fungiand then i examined mailman's smtp log for each of the lists involved19:27
fungiso thorough enough to exercise the exim configuration anyway19:28
clarkbseems to do what it says on the tin +219:33
fungithanks19:37
fungiif one more infra-root has a chance to at least look at 821914 i'll feel pretty good about the direction there, the one after it (821914) is just a simple ml addition19:41
fungier, 821915 is the one after it i meant19:42
fungionce both of those deploy, assuming no new errors arise with the list creation, i'll work on the manual steps for copying the list config and subscribers in preparation for landing the forwarding change (821916)19:43
opendevreviewIan Wienand proposed openstack/project-config master: nodepool: Remove . from openEuler name  https://review.opendev.org/c/openstack/project-config/+/82204620:40
clarkboh right I meant to check if that had been done yet20:41
clarkbwe might need to manually clean up the old files but that shouldn't be too much trouble. I've approved ^20:42
fungithanks ianw, i too hadn't gotten to doing that yet20:47
fungishould we switch it to all lower-case as well? all our other labels are20:49
funginow would be the time to decide, lest we have to clean up twice20:49
ianwfungi: no probs -- since i'm not really here today and it is going in i think it's ok20:51
ianwprobably good idea to have some mixed case and periods (or full-stops as we more civilised people call them :) to shake this out20:52
ianwi've put really fixing it in nodepool on the todo20:52
fungiawesome, thanks again20:53
fungiand if you're not really here today, you should get on with not being here! ;)20:53
fungii'll go ahead and approve the domain aliases addition for lists.o.o as well as the new staff ml creation for lists.openinfra.dev so i can try to work on the list move some this evening20:55
opendevreviewIan Wienand proposed openstack/diskimage-builder master: centos: work around 9-stream BLS issues  https://review.opendev.org/c/openstack/diskimage-builder/+/82177220:56
opendevreviewMerged openstack/project-config master: nodepool: Remove . from openEuler name  https://review.opendev.org/c/openstack/project-config/+/82204620:58
fungii also removed the list servers from the emergency disable list a little while ago, now that we believe our orchestrated list creation to be working21:00
fungiand i need to start putting dinner together while i wait for those to merge and deploy21:01
fungiclarkb: the . removal deployed, we're clear to delete the images on disk i guess?21:09
Clark[m]Maybe check that it isn't trying to upload the image still first?21:12
Clark[m]Just finishing up lunch then I can take a look too if that helps21:12
fungii'll try to look in a moment21:12
*** dviroel|rover is now known as dviroel|out21:21
fungionce i finish eating21:25
fungiokay, back and working on image cleanup21:50
fungiopenEuler-20.03-LTS-SP2-0000000055 is ready and openEuler-20-03-LTS-SP2-arm64-0000000001 is building, so it should be safe to dib-image-delete openEuler-20.03-LTS-SP2-000000005521:54
Clark[m]++21:54
Clark[m]Except you might have to delete the records from zk too? Not sure how it handles that21:55
fungiimage-list reports several failed state openEuler-20.03-LTS-SP2 images in ovh-bhs121:55
Clark[m]nodepool dib-image-list should tell you if there is still a build record for the old one21:55
fungioh, bad news... it seems to want to build both21:56
fungiafter the dib-image-delete there's a openEuler-20.03-LTS-SP2-0000000056 in building state21:56
fungiaha, the configuration on nb02 isn't updated21:57
fungi-rw-r--r-- 1 root root 14872 Dec 14 06:43 /etc/nodepool/nodepool.yaml21:58
fungishouldn't the deploy job have taken care of that?21:58
fungisame for nb0121:58
fungionly the arm64 builder config (nb03) was updated21:59
fungimaybe the fix was incomplete21:59
fungiyep, that's it22:00
opendevreviewJeremy Stanley proposed openstack/project-config master: nodepool: Remove yet still more . from openEuler  https://review.opendev.org/c/openstack/project-config/+/82205222:03
fungiclarkb: ianw: ^22:03
clarkbapproved22:10
fungimuch obliged22:10
fungii'll see what remains to be cleaned up once that deploys22:10
opendevreviewMerged openstack/project-config master: nodepool: Remove yet still more . from openEuler  https://review.opendev.org/c/openstack/project-config/+/82205222:21
opendevreviewMerged opendev/system-config master: Add a domain aliases mechanism to lists.o.o  https://review.opendev.org/c/opendev/system-config/+/82191423:14
opendevreviewMerged opendev/system-config master: Create an OpenInfra Foundation staff ML  https://review.opendev.org/c/opendev/system-config/+/82191523:18
rlandy|ruckhello ... maybe I'm late to notice this - but we have a lot of tox failures going on across various products: 23:59
rlandy|ruckhttps://4effb742a88be8659f07-40bd60678638a1db566d5d37b438f20d.ssl.cf5.rackcdn.com/822049/1/gate/openstack-tox-py36/8e12896/job-output.txt23:59
