Monday, 2022-12-05

ianwi guess we don't have --upgrade in our pip call, so bridge hasn't pulled in ansible <8 on the deployment00:06
ianwi wonder if it's a better idea to write out a requirements file, and run pip with --upgrade if the requirements file changes00:10
ianwotherwise i feel like it's constantly hitting pypi00:10
BlaisePabon[m]<ianw> "otherwise i feel like it's..." <- Have you considered changing your piprc to point to a local cache?00:40
BlaisePabon[m]Pulp project has a nice pypi (and other) server.00:40
ianwBlaisePabon[m]: that seems to just move the problem from how to update the production install to how to update the cache :)00:53
ianweither way i think maybe not a requirements file, but a stamp file from a template should be idempotent.  if it changes, then add the --upgrade flag00:53
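A minimal sketch of the stamp-file idea above, written in Python only for illustration (the real deployment is driven by Ansible, and the function names, the stamp path and the package list here are all hypothetical). It hashes the rendered template, compares that with the stamp left by the previous run, and only asks pip to upgrade when the two differ.

```python
import hashlib
from pathlib import Path


def pip_args(rendered: str, stamp: Path, packages: list[str]) -> list[str]:
    """Build pip arguments, adding --upgrade only when the stamp changed."""
    digest = hashlib.sha256(rendered.encode()).hexdigest()
    previous = stamp.read_text().strip() if stamp.exists() else None
    args = ["install"]
    if previous != digest:
        # The rendered template differs from the last run (or there was
        # no previous run), so ask pip to upgrade.
        args.append("--upgrade")
    return args + packages


def record_stamp(rendered: str, stamp: Path) -> None:
    """Write the stamp; call this only after pip has succeeded."""
    stamp.write_text(hashlib.sha256(rendered.encode()).hexdigest() + "\n")
```

Writing the stamp only after a successful install means a failed run is retried with --upgrade next time, and an unchanged template never triggers a new upgrade attempt.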
ianwi shouldn't have started looking, because now ansible has moved to collections, the way we pick the versions doesn't make a lot of sense now02:39
opendevreviewIan Wienand proposed opendev/system-config master: bootstrap-bridge: Add cautionary note on installation of Ansible from git  https://review.opendev.org/c/opendev/system-config/+/86654103:45
opendevreviewIan Wienand proposed opendev/system-config master: [wip] overhaul install ansible requirements for 2022  https://review.opendev.org/c/opendev/system-config/+/86654203:45
opendevreviewIan Wienand proposed opendev/system-config master: [wip] overhaul install ansible requirements for 2022  https://review.opendev.org/c/opendev/system-config/+/86654203:52
opendevreviewIan Wienand proposed opendev/system-config master: [wip] overhaul install ansible requirements for 2022  https://review.opendev.org/c/opendev/system-config/+/86654203:58
opendevreviewIan Wienand proposed opendev/system-config master: [wip] overhaul install ansible requirements for 2022  https://review.opendev.org/c/opendev/system-config/+/86654204:10
opendevreviewIan Wienand proposed opendev/system-config master: [wip] overhaul install ansible requirements for 2022  https://review.opendev.org/c/opendev/system-config/+/86654204:16
*** yadnesh|away is now known as yadnesh04:30
*** marios is now known as marios|ruck06:09
ianwi've received no email from gerrit today :/06:22
ianwlooking now06:22
ianwsigh ... SMTP error from remote mail server after RCPT TO:<my@email>: 550 zen.mimecast.org https://www.spamhaus.org/sbl/query/SBLCSS06:23
ianwDelisting successful06:35
ianwNice work and congratulations!06:35
ianwfrom the logs, @redhat.com, @hp.com seem the most affected, with a handful of other addresses showing the same thing06:38
ianw#status log delisted review.opendev.org from Spamhaus blocklist, several corporate domains were rejecting Gerrit mail06:39
opendevstatusianw: finished logging06:39
*** ysandeep__ is now known as ysandeep|afk07:49
*** yadnesh is now known as yadnesh|afk08:01
opendevreviewIan Wienand proposed opendev/system-config master: install-ansible: overhaul install ansible requirements  https://review.opendev.org/c/opendev/system-config/+/86654208:20
ianwclarkb / fungi : ^ that one will actually get ansible 7 deployed to bridge.  it's a bit long, but a lot of code-removal and i think a good simplification for the current era08:21
*** jpena|off is now known as jpena08:42
*** ysandeep|afk is now known as ysandeep09:10
*** benj_79 is now known as benj_709:15
*** yadnesh|afk is now known as yadnesh09:24
*** ysandeep is now known as ysandeep|brb10:49
*** ysandeep|brb is now known as ysandeep10:59
*** dviroel|afk is now known as dviroel11:05
*** rlandy|out is now known as rlandy|rover11:10
*** pojadhav is now known as pojadhav|brb11:31
*** yadnesh is now known as yadnesh|afk11:45
*** pojadhav|brb is now known as pojadhav12:23
*** yadnesh|afk is now known as yadnesh12:26
*** frenzy_friday is now known as frenzy_friday|food12:40
*** arxcruz is now known as arx|202313:08
*** dasm|off is now known as dasm13:38
*** yadnesh is now known as yadnesh|away14:21
*** pojadhav is now known as pojadhav|afk14:38
*** ysandeep is now known as ysandeep|dinner14:52
fungishortly i'll lower the ttls on the address records for lists.opendev.org and lists.zuul-ci.org in order to facilitate faster updates during the 20:00 utc maintenance15:02
Clark[m]fungi: did the new server get zuul-ci lists pre created or just OpenDev.org?15:08
Clark[m]https://opendev.org/opendev/system-config/src/branch/master/inventory/service/host_vars/lists01.opendev.org.yaml#L214 I think means only OpenDev.org stuff is prepped15:09
Clark[m](also looking like a slow start to the day for me here but will try to be helpful)15:10
Clark[m]I suppose you could probably do the migration then uncomment those lines too? Just wanted to call that out as there is time to modify if necessary 15:16
corvusi have cleared the moderation queues on the zuul lists15:33
Tenguhello there! I guess it's therefore not the right time to request reviews and, if possible, merge on my changes here? https://review.opendev.org/c/opendev/system-config/+/866175 + https://review.opendev.org/c/openstack/project-config/+/866475 (beware that last one, it has already bitten us last week)15:35
Tengu(seeing the "maintenance" mentioned earlier.)15:36
fungiClark[m]: oh, good call. i'll push up the patch to uncomment those in a moment15:38
fungiTengu: oh, it's a fairly light maintenance and not starting for another 4+ hours anyway, but details are at https://lists.opendev.org/pipermail/service-announce/2022-December/000049.html15:39
*** pojadhav|afk is now known as pojadhav15:40
Tengu"light" :915:41
Tengustill - if those patches could be nudged then... :)15:42
*** dviroel is now known as dviroel|lunch16:00
opendevreviewJeremy Stanley proposed opendev/system-config master: Create lists.zuul-ci.org on the Mailman v3 server  https://review.opendev.org/c/opendev/system-config/+/86659916:07
fungicorvus: Clark[m]: ^16:07
*** ysandeep|dinner is now known as ysandeep16:13
*** ysandeep is now known as ysandeep|out16:15
*** frenzy_friday|food is now known as frenzy_friday16:24
*** marios|ruck is now known as marios|out16:35
clarkbfungi: +2, you may want to go ahead and approve that if trying to get it in before the maintenance window as that modifies the inventory so will take a while to apply iirc16:37
fungiyep, i'm working on the dns changes now16:38
fungii'll approve it16:38
*** dviroel|lunch is now known as dviroel16:51
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Temporarily lower the address TTLs for lists  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660417:06
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Temporarily CNAME lists to review for deferral  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660517:06
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Switch lists to resolve to the new Mailman server  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660617:06
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Restore the default TTL to lists  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660717:06
fungiinfra-root: that's ^ the maintenance dns change stack for the opendev.org zone. the first should be approved pretty much asap, i'll wip the others and make a similar series for zuul-ci.org17:07
opendevreviewJeremy Stanley proposed opendev/zone-zuul-ci.org master: Temporarily lower the address TTLs for lists  https://review.opendev.org/c/opendev/zone-zuul-ci.org/+/86660817:11
corvusfungi: where's the etherpad plan?17:13
fungihttps://etherpad.opendev.org/p/mm3migration17:14
fungiif you're concerned about the temporary cname to review02, i have tested that attempting deliveries to it queues up in my mta's deferrals17:15
fungibut i'm open to other similarly simple solutions17:15
corvusi am -- (i didn't remember cname in the original plan; was assuming just a new A record; i'm mentally going through the possibilities now)17:16
fungii can make it an address record for one of our servers instead if that sits better with you17:16
corvusi think it's fine, just working through it :)17:17
fungioh, i see what you mean, we used a/aaaa records in other zones too rather than cname to the server17:17
fungiand yes, i agree cname for mail delivery has traditionally been discouraged17:17
corvusi'm thinking about both things really -- the temporary cname, and the permanent cname after the move.17:18
opendevreviewMerged opendev/system-config master: Create lists.zuul-ci.org on the Mailman v3 server  https://review.opendev.org/c/opendev/system-config/+/86659917:18
corvusfungi: note that some MTAs will literally rewrite the addresses if there is a cname (taking the "canonical name" part literally)17:18
fungii'm perfectly happy to use a/aaaa in those changes, no sweat17:18
corvusso that may be a reason to avoid that in the end-state.17:18
clarkbI've approved the TTL lowering change17:19
clarkbfor lists.opendev.org specifically17:19
fungiyeah the ttl changes are fine regardless of the remaining implementation details. 866608 is the equivalent one for zuul-ci.org btw17:19
clarkb+2 on that one. Will let corvus approve when happy17:20
corvusi don't think a temporary cname should be a big problem (aside from potentially having some weird messages if we happen to get some messages from mtas that perform that rewriting during that time).  maybe an A or MX would avoid that?  but it's pretty unlikely to be a problem.17:20
fungiwell, if we're going to end with a/aaaa then i may as well just stick with those throughout the series for consistency and ease of reasoning17:21
corvuswfm and lets us shut off the "is this CNAME okay?" mental subroutine :)17:21
fungiexactly17:22
opendevreviewMerged opendev/zone-opendev.org master: Temporarily lower the address TTLs for lists  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660417:24
opendevreviewMerged opendev/zone-zuul-ci.org master: Temporarily lower the address TTLs for lists  https://review.opendev.org/c/opendev/zone-zuul-ci.org/+/86660817:27
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Temporarily point lists to review for deferral  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660517:51
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Switch lists to resolve to the new Mailman server  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660617:51
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Restore the default TTL to lists  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660717:51
opendevreviewJeremy Stanley proposed opendev/zone-zuul-ci.org master: Temporarily point lists to review.o.o for deferral  https://review.opendev.org/c/opendev/zone-zuul-ci.org/+/86661317:51
opendevreviewJeremy Stanley proposed opendev/zone-zuul-ci.org master: Switch lists to resolve to the new Mailman server  https://review.opendev.org/c/opendev/zone-zuul-ci.org/+/86661417:51
opendevreviewJeremy Stanley proposed opendev/zone-zuul-ci.org master: Restore the default TTL to lists  https://review.opendev.org/c/opendev/zone-zuul-ci.org/+/86661517:51
fungiinfra-root: those ^ are the maintenance dns stacks for both domains, revised to use exclusively a/aaaa rrs. i'll wip all 6 for now until closer to time to merge17:52
*** dviroel is now known as dviroel|afk17:52
*** jpena is now known as jpena|off17:58
fungii've checked and cleared the moderation queues for the mailing lists i watch over, but will check them again closer to maintenance time just to be sure18:14
fungiincluding all three service-* lists18:14
clarkbfungi: thank you!18:14
fungiinfra-root: i've added links for the dns changes to the migration plan at the top of https://etherpad.opendev.org/p/mm3migration so we have them for easy reference later, and also added a draft status notice for step 6. i'll get working on steps 4 and 5 shortly, but not crossing off step 4 until i see the deploy jobs confirmed and double-check dns resolution myself18:37
fungier, not crossing off step 3 i mean18:37
clarkbthe notice text lgtm18:38
fungistep 5 won't really take all that long for this migration, but saves us a few minutes during the outage18:38
fungifor lists.openstack.org it's critical, since the initial rsync will take hours18:38
fungibut this also gives us a good opportunity to refine the process for the benefit of the remaining migrations next month-ish18:39
clarkb++18:40
fungii've created root screen sessions on lists.openstack.org and lists01.opendev.org for use in the preliminary steps and also during the maintenance18:44
fungiinfra-root: any recommendations for speedier dns updates during maintenance? should we temporarily stop the opendev-prod-hourly pipeline so it doesn't block deploy?19:15
fungii'm noticing those dns update changes took ~1.5 hours to deploy after they merged, even though the jobs themselves only ran for 5 minutes19:15
Clark[m]Were they behind the inventory update for extra lists which takes forever? Otherwise I would expect 0-~40 minutes to run.19:17
Clark[m](depending on where the hourly jobs are)19:17
fungioh maybe. faster might still be nice though since it would shorten the maintenance window19:17
Clark[m]One trick is to land the changes before the top of the hour so that they get the semaphore prior to the hourly jobs. We could also maybe land a change to temporarily disable the hourly jobs19:18
fungiat the very least, let's try to avoid approving any system-config changes not related to today's maintenance until after we're done19:18
fungibut yeah, i'm mainly concerned with merging the dns changes that start the maintenance outage, less so with the ones which end it19:19
Clark[m]I think either try to time landing them for just before 20:00 UTC or land a change nowish that disabled the hourly jobs19:20
Clark[m]The disabling will be more reliable19:20
fungiat least my resolver sees a 300-second ttl being served with the a/aaaa rrs for lists.opendev.org and lists.zuul-ci.org now so i'm crossing step #3 off as done19:21
fungiby 19:42z we can expect that all reasonable clients are respecting the new ttl19:22
fungiinfra-root: we're 15 minutes out from the official start of maintenance, so i'm approving the two changes to temporarily switch the lists.opendev.org and lists.zuul-ci.org names to resolve to our gerrit server19:45
clarkbsounds good19:46
fungithe initial rsync of the sites to the new server is done, so we're ready to proceed at the top of the hour19:47
opendevreviewMerged opendev/zone-zuul-ci.org master: Temporarily point lists to review.o.o for deferral  https://review.opendev.org/c/opendev/zone-zuul-ci.org/+/86661319:48
fungii've also sketched out a few of the more important commands for the upcoming steps in the maintenance plan at https://etherpad.opendev.org/p/mm3migration19:48
fungiassuming our dns changes deploy as hoped19:49
fungilooks like the remaining change is taking time to get a node assignment19:50
fungithere it goes19:52
corvusi'm around, along with my sandwich19:52
fungii'll be making pad prik king once this is done19:52
opendevreviewMerged opendev/zone-opendev.org master: Temporarily point lists to review for deferral  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660519:53
fungibut in the meantime i'm jealous of that sandwich19:53
clarkbyall are making me hungry19:53
fungizuul-ci.org dns change already deployed, opendev.org change seems to be waiting behind event processing19:54
clarkbit will enqueue ahead of the hourly jobs though19:55
fungiand there it is19:55
clarkbso should be fine for quick processing19:55
fungiyeah, if it finishes at the predicted time, we can start importing at 5 after the hour19:55
fungi(accounting for the ttl)19:56
clarkbdo we also shutdown the daemons on lists.openstack.org? I guess that doesn't help because exim will accept the email either way so rely on dns19:56
fungiyes19:56
fungiexactly why we need to wait for dns19:56
fungiwe don't want inbound messages to end up in the exim queue on the old server19:57
fungii'll send the status notice at the top of the hour to make sure it has time to circulate19:57
ianwo/ ... all seems to be going well! :)19:58
fungi#status notice The lists.opendev.org and lists.zuul-ci.org sites will be offline briefly for migration to a new server20:00
opendevstatusfungi: sending notice20:00
-opendevstatus- NOTICE: The lists.opendev.org and lists.zuul-ci.org sites will be offline briefly for migration to a new server20:00
fungi866605 deployed at roughly 19:59 so we can proceed at 20:04 with next steps20:01
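The timing here is just the deploy time plus the lowered TTL: resolvers that cached the old answer right before the change can keep serving it for at most one TTL. A small sketch of that arithmetic, using the 300-second TTL and the approximate 19:59 UTC deploy time mentioned above (the function name is made up for illustration):

```python
from datetime import datetime, timedelta, timezone


def safe_to_proceed(deployed_at: datetime, ttl_seconds: int) -> datetime:
    """Earliest time by which well-behaved caches have expired the old record."""
    return deployed_at + timedelta(seconds=ttl_seconds)


deployed = datetime(2022, 12, 5, 19, 59, tzinfo=timezone.utc)
print(safe_to_proceed(deployed, 300).strftime("%H:%M"))  # 20:04
```

This assumes clients honor the TTL; a resolver that ignores it can still hold a stale answer for longer.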
fungiianw: yes, we're at step #9 on https://etherpad.opendev.org/p/mm3migration20:02
fungibut we've really not reached any of the hairy parts yet20:03
opendevstatusfungi: finished sending notice20:03
fungiokay, we should be safe to stop the services for those two sites on the old server now20:05
ianwgood to know the "noticeboard push pin" works @ https://fosstodon.org/@opendevinfra/109462853268874327 :)20:05
fungivery nice!20:05
fungiokay, systemd reports mailman-opendev and mailman-zuul are definitely stopped20:07
fungifinal rsync to the new server is done20:08
fungiokay, site copies are moved into the import directory on the new server and recursively chowned to the uid/gid expected by the containers20:10
funginow comes the exciting part: the import process20:10
fungivery minor catch. i had my trailing slashes wrong in rsync. need to move the resulting directories a little. i'll correct in the pad20:11
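For reference, the trailing-slash catch comes from how rsync treats a source directory: with a trailing slash it copies the directory's contents into the destination, and without one it creates the directory itself inside the destination. The sketch below is a pure-Python model of that rule, not the command that was actually run, and the paths in the usage lines are hypothetical.

```python
from pathlib import PurePosixPath


def rsync_landing_dir(source: str, dest: str) -> str:
    """Directory where the files under `source` land for `rsync -a source dest`."""
    if source.endswith("/"):
        # Trailing slash on the source: its contents go straight into dest.
        return str(PurePosixPath(dest))
    # No trailing slash: a directory named after the source is created in dest.
    return str(PurePosixPath(dest) / PurePosixPath(source).name)


print(rsync_landing_dir("/srv/site/", "/import"))  # /import
print(rsync_landing_dir("/srv/site", "/import"))   # /import/site
```

Getting this wrong produces exactly one extra (or one missing) directory level, which is why moving the resulting directories was enough to fix it.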
fungiand done. trying the import again20:13
fungiit's running now20:13
fungiside note, need to decide how to permanently stop those services on the old server after the migration. i guess a change to remove their sites will suffice20:17
clarkbfungi: oh good point. And ya I think you can just remove the content from ansible and then disable the services on the host20:18
clarkbansible should leave it alone at that point20:18
fungii tacked it onto the end of the maintenance plan so we don't forget20:19
fungiopendev's done, no reported errors that i can see. took roughly 6 minutes. starting zuul...20:20
fungitechnically i can do #13 while this is running, so i'll work on that now20:22
fungitest message sent to the incident list and i've received my copy. other subscribers can double-check20:25
fungialso the zuul import finished in roughly 3 minutes20:25
clarkbfungi: I got the test message20:25
fungii'll check that the sites look right with overridden dns resolution20:25
clarkbfungi: the manual injection is so that you're sure exim and mm3 on the new server processed it?20:26
fungicorrect20:26
fungijust want to make sure it's accepting list mail and distributing it before we point the world at it20:26
corvusmsg lgtm20:27
fungiand the fact that subscribers received it means the subscriber list import worked20:27
fungiwith dns resolution overridden locally, browsing https://lists.opendev.org/ and https://lists.zuul-ci.org/ seems to work and i can see archive contents, like https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/thread/HLQ2NDG5BKNM5L7VWAQTWSROW2L6HJQM/ (the maintenance announcement)20:30
fungihttps://lists.zuul-ci.org/archives/list/zuul-discuss@lists.zuul-ci.org/thread/S7AMZ4NXZCVT36NHRJRITPPNCL2L4L23/ too20:30
fungii think this checks out, so we should be able to update dns20:30
clarkbfungi: what about the old archive redirects? (easy to fix if they don't work)20:30
fungiyeah, we can worry about that after20:31
clarkbI think you likely can proceed regardless of old archive links20:31
clarkb++20:31
fungiapproved 866606 and 86661420:31
fungithere's another step in the plan for more thorough testing after dns updates, i just wanted a cursory check to decide whether we needed to roll back20:32
fungisimpler to do before people are able to start sending mail to the new server20:33
opendevreviewMerged opendev/zone-opendev.org master: Switch lists to resolve to the new Mailman server  https://review.opendev.org/c/opendev/zone-opendev.org/+/86660620:33
clarkb++20:33
opendevreviewMerged opendev/zone-zuul-ci.org master: Switch lists to resolve to the new Mailman server  https://review.opendev.org/c/opendev/zone-zuul-ci.org/+/86661420:34
fungi5 minutes after those deploy, we can move on to step 1620:34
fungiactually, there was a opendev-prod-hourly buildset underway (done now) but it didn't block deploy for these that i could tell20:36
fungior maybe it was just fortuitous timing combined with some browser lag20:36
clarkbfungi: it finished20:36
clarkbI think this was perfect timing :)20:36
fungiindeed, perhaps so20:36
fungii love it when a plan comes together (too soon to cue the a-team theme music though)20:37
clarkbchecking email it looks like lists01 backups were sorted out too20:37
fungiyes, ianw fixed that before i even found time to investigate, so thanks!20:37
clarkbrelated to ^ borg 2.0 is going to release soon. It is not backward compatible. But I checked and we seem to pin to a 1.x release already so we should be fine20:37
fungiapparently we had the wrong container name in the cronjobs20:38
clarkbaha20:38
clarkbthank you ianw for figuring that out20:38
fungiwhile we're waiting, i'll optimistically start writing our conclusion e-mail as a reply to my original service-announce message, but hold off sending it until testing is done20:40
fungiand that's readied to send once we're clear20:45
fungithe second dns patch just finished, so at 20:50 utc we should be safe to test20:45
clarkbI'm getting the new web ui without any overrides for lists.opendev.org. Looks happy after a quick exploration20:46
corvusi have to wait a whole 10 seconds20:47
corvusan interesting side effect is that some browsers will cache the initial redirect that gerrit performs20:48
clarkbhrm maybe we want to use an IP that doesn't have a web server in the future then? A zuul merger?20:49
corvusso even after dns updated, i needed to type in something other than "lists.opendev.org" in order to actually see the new site20:49
corvusyeah maybe so20:49
fungiagreed20:50
fungiabout not using gerrit for the parking20:50
fungii was able to go to lists.opendev.org fine in my browser20:51
fungibut hopefully if someone went to the url and cached a redirect in that short time, a cache clear will fix it20:51
fungiwe can pick something better for the next time20:51
fungianyway, are things looking generally okay for everyone else?20:52
clarkbfungi: I added another less urgent followup task to the todo list (double checking db backups now that we have content for lists)20:53
corvusfungi: should i "sign up" ?  (or "sign in?")20:53
clarkbfungi: things look ok to me20:53
fungicorvus: sign up20:53
clarkbcorvus: I believe you sign up with the existing email addr that old lists knew you as and it will associate your accounts via that once you confirm the account20:53
fungifor some reason it seemed like an earlier v3 precreated accounts, but the later ones we tested don't. they will, however, automatically associate your subscriber/moderator/owner roles to your account once you confirm it, as long as the address you use matches20:54
fungiand it's server wide (not just site-wide), so once you have an account it's the same across all the sites on that server20:55
fungii mentioned it in the maintenance announcement, but i'm keeping it reiterated in the follow-up i'll send once we're cool with this20:55
clarkbnote I haven't signed up yet. I should do that I guess20:57
fungii signed up on lists.zuul-ci.org and then logged into lists.opendev.org with the same credentials just to test, and it worked20:59
fungithis will become less confusing when we integrate the sso we're working on20:59
clarkbseems to work for me too21:00
fungibut it seems to work as designed21:00
corvusthis all lgtm21:00
clarkbit shows me the lists I own/moderate21:00
fungiobviously we're going to spend a while fiddling with new options in the list configs, but sounds like we're probably good to call it migrated?21:00
clarkbyes, I think you can send that email now21:00
corvusi did a password reset for lists.zuul-ci.org, and the page header said lists.opendev.org.  that's a pretty minor thing i think we can probably ignore.21:01
corvus++ sending email21:01
fungisent21:02
fungiand i seem to have received my copy as a subscriber21:03
clarkbI've received it and it even filtered as expected so the migration didn't break my rules (based on list id iirc)21:03
fungii'm going to set this aside for a bit and cook dinner, then look at writing a change to permanently disable those two sites on the old server so we don't accidentally wind up having them restart and send stale digests or anything21:04
clarkbsounds good. Thank you for all the help building out the config management for this and testing it and building a migration plan and now migrating two sites.21:04
fungii figure the dns ttl cleanup can wait for a day until we're sure we don't need to make any urgent dns updates to these21:04
corvusfungi: it looks great, thanks for all the work!21:04
clarkbfungi: ++21:04
fungithis is still sort of just the beginning, but glad it's gone smoothly!21:05
clarkbI'm going to eat lunch now. One of the things I need to do later today is send a meeting agenda which will help exercise this further21:05
ianwi've got the test email too21:05
corvusi will be sure to "like" your posts :)21:06
corvusoh interesting, i don't see the 'completed' post in the archives yet21:07
corvushttps://lists.opendev.org/archives/list/service-announce@lists.opendev.org/latest21:07
corvusare they cron, or instantaneous?21:08
ianwi haven't received my signup mail; but also after yesterday and spamhaus etc, this could very well be a @redhat.com problem21:10
Clark[m]corvus: They are cron jobs. We had trouble with this in testing iirc21:11
Clark[m]There is a test step that forces a cron run to populate the (empty in testing) archives21:11
corvuswhat's the interval?21:11
Clark[m]That I don't remember it was short enough that sometimes the testing worked and sometimes it didnt21:12
Clark[m]They are adjustable too iirc21:12
corvusianw: i got signup emails instantaneously (literally zero delay)21:12
ianwinteresting, in the exim log on lists01, there's @redhat.com mail waiting21:12
ianwSMTP error from remote mail server after RCPT TO:<address@redhat.com>: 451 Internal resource temporarily unavailable - 21:13
ianwhttps://community.mimecast.com/docs/DOC-1369#451 [-x9V7H_7N8iFec1UDFFY6Q.us182]21:13
corvusianw: probably greylisting the new ip21:13
ianw"451 Unable to process connection at this time The Mimecast server is under maximum load."21:13
ianwyeah, that seems likely21:13
corvusianw: (also, next line under that mentions greylisting specifically)21:13
ianwheh, yeah21:14
corvusactually "451 Internal resource temporarily unavailable" from the exim log is the greylisting form of that msg21:14
corvusso i think that's greylisting confirmed21:14
corvus(they seem to use 451 for a lot of stuff and then disambiguate based on the accompanying message)21:15
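The distinction that matters in these logs is the first digit of the SMTP reply code: a 4xx reply such as the 451 above is a transient failure that the sending MTA queues and retries, while a 5xx reply such as the earlier 550 from the Spamhaus listing is a permanent rejection. A minimal sketch of that classification (the function name and the wording of the labels are illustrative):

```python
def smtp_reply_class(code: int) -> str:
    """Classify an SMTP reply code by its first digit."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "intermediate"
    if 400 <= code < 500:
        # Transient: the sender keeps the message queued and retries later.
        return "transient failure"
    if 500 <= code < 600:
        # Permanent: the sender gives up and bounces the message.
        return "permanent failure"
    raise ValueError(f"not an SMTP reply code: {code}")


print(smtp_reply_class(451))  # transient failure
print(smtp_reply_class(550))  # permanent failure
```

This is why greylisting only delays mail: the retry after the greylist interval is normally accepted.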
ianwyeah -- i guess not much we can do about that, possibly try to figure out the corporate it system and get it whitelisted21:15
ianwi think it's likely retries will get mail through faster than that21:16
corvusthat will probably take longer than the typical 15 minutes spent greylisting21:16
corvuspresumably that should lead to an auto-whitelist so that messages don't get greylisted anymore... it would be good to track performance for a bit and make sure that happens21:17
corvusthis will "prime" remote systems and then hopefully the rest of the lists don't have to go through this in the later transition21:17
ianwyep -- and i did get fungi's test mail in a timely fashion21:18
corvuswe might want to "chat" a bit on the incident or discuss lists over the next days21:18
corvusi think we should exchange holiday recipes21:18
Clark[m]fwiw I agree that I don't see the email in the archive. I think there are ~10 minute jobs, hourly jobs and daily jobs. Maybe this is hourly by default which is probably too infrequent? Definitely worth looking at more closely to make sure we understand it21:20
ianw"451 Too many mentions of eggnog"21:20
ianwit's different versions, but visually on https://lists.opendev.org/archives/ there's no left (or right) margin for me.  compared to say https://lists.fedorahosted.org/archives/21:25
ianweverything loads correctly (thought it may have been a missing css file) so may be intended21:25
ianwi think it's intentional ... https://gitlab.com/mailman/hyperkitty/-/merge_requests/39821:31
ianwthe top < navigation button is right on the bit where my pixel 6 pro screen starts to curve around, making it hard to click.  i think all-lists wants some margins -- but it's not our fault.  i can probably file something21:32
Clark[m]Ya we intentionally didn't try a custom theme21:33
clarkblooks like one thing to check re archiving is that it is enabled for the list in the list settings (I would've expected the migration to do this but we should double check it)21:45
clarkbin theory hyperkitty is functional though otherwise the migration wouldn't have been able to populate the archives21:47
clarkb`docker exec mailman-web ./manage.py runjobs hourly` is what the test job runs to force things to run early and ensure archives (empty ones) are present21:48
clarkbconfusingly I think hyperkitty stuff operates in the -core container, but -web controls it using an api key21:48
clarkbthe hourly runs should run in ten minutes so that will be a good clue too I guess21:51
clarkbarchived-at is empty in the email we got21:53
clarkbfungi: the sign in page shows other login options too21:54
clarkbfungi: I thought we had disabled that21:54
clarkboh wait that may be my fault it opened the upstream list in tab complete...21:54
clarkbyup my fault we're good for logins21:55
clarkbok archiving should be enabled for service-discuss according to the configuration page for that list21:56
clarkboh but you sent it to service-announce21:56
fungiright21:56
fungibecause... announcement21:56
clarkbya sorry, but it too has the flag set in settings21:56
clarkbthats good implies the migration process handled that for us21:57
* fungi is finished making, eating, and cleaning up from dinner now21:57
clarkbhourly crons should run momentarily and we can recheck and then take it from there I guess21:59
corvusf5 f5 f5 f522:00
fungii'm just about done with monday evening chores that have to get done before it's dark out, so can help look in a moment22:00
*** dviroel|afk is now known as dviroel22:01
clarkblog says it ran22:01
clarkbbut I don't see it so that isn't it I guess22:01
corvusthe mbox file in /var/lib/mailman/web-data/mm2archives/lists.opendev.org/private/service-announce.mbox does not have the new msg22:03
corvus(but that says "mm2" no idea if that's still relevant for hyperkitty?)22:04
clarkb/var/lib/mailman/core/var/logs/mailman.log has hyperkitty errors22:05
clarkbhttps://paste.opendev.org/show/borF78wGCKTOxv6wGy2N/22:06
corvusso the message failed when being injected into the archives22:07
clarkband then it emits an html page that is fairly large with "This page either doesn't exist, or it moved somewhere else."22:07
clarkbcorvus: yes I think so. Something about the url it is trying to hit to perform that action?22:07
corvusshould it be using an http://127.0.0.1 url for that action?22:08
clarkbcorvus: yes I believe so22:08
fungii wonder if we're missing some plumbing to the api on the default vhost22:09
clarkbit listens on a special port22:09
clarkbI didn't think external facing web server was involved22:09
fungiyeah, 8000/tcp looks like22:09
fungiPage not found: This page either doesn't exist, or it moved somewhere else.22:10
fungithat seems to be coming from postorius22:10
corvusso it's not relying on hostnames for site routing?22:11
clarkbya but is that because the other end sent it a 404 or it sent a 404? I'm having a hard time processing what that log is trying to tell us22:11
corvusmy fuzzy read of that is something like "it thought it was posting to a hyperkitty url and got a 404 back from postorius"22:12
clarkbcorvus: oh hrm22:12
clarkboh you know what22:13
clarkbfungi: the urls changed right?22:13
clarkbit was /hyperkitty but now is /archives22:13
clarkbI suspect ^ is what broke it22:13
clarkbwe might want to flip that back and test?22:14
clarkbalternatively we edit our docker-compose var to use a different url for hyperkitty22:14
clarkb?22:14
clarkbHYPERKITTY_URL=http://127.0.0.1:8000/hyperkitty <- maybe that needed to be aligned with your routes update that we did to address the django update22:14
clarkbso ya I think I would try that first, switch it to /archives or whatever the updated url was22:15
corvusclarkb: where is that setting?22:15
corvusfound it22:16
clarkbcorvus: /etc/mailman-compose/docker-compose.yaml22:16
clarkbis the on host file22:16
corvus(docker-compose)22:16
clarkbhttps://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mailman3/templates/docker-compose.yaml.j2 ansible side for that file and https://opendev.org/opendev/system-config/src/branch/master/docker/mailman/web/mailman-web/urls.py#L29 is where things got updated22:17
clarkbpreviously that was 'hyperkitty/' I think22:17
clarkbso ya thats my first hunch22:17
clarkbhttps://github.com/maxking/docker-mailman/blob/main/web/mailman-web/urls.py#L29 is what it was set to previously22:18
fungid'oh, yep upstream mailman changed that years ago but the docker images never did. aligned the templates with upstream but it's possible that was missed22:18
clarkbby the way the /var/lib/mailman/web-data/logs/mailmanweb.log has logs showing the urls are 404'ing there in the webserver22:19
fungiit's possible we need to adjust something else in the images22:19
clarkbfungi: in this case we may need to only update docker-compose to match22:19
fungimaybe a git grep for the old url path in the docker repo will turn up something?22:19
fungioh, even better22:20
clarkbbecause email comes in via exim, exim things hit mailman-core, mailman-core hits web :8000 hyperkitty api and then it archives is the flow I think22:20
fungiooh22:20
clarkband that value is configurable via the image already22:20
corvusi agree with clarkb22:20
clarkbso maybe manually update the docker-compose file and down then up things22:20
clarkb(someone else should do that as I'm not in a great spot to do so myself)22:20
corvusfungi: can you, or should i?22:21
fungicorvus: if you're already right there, please feel free22:22
corvuscan do22:22
fungii'm still digesting the specifics22:22
fungianother e-mail to service incident should help us confirm it's fixed22:22
corvusdone, i'll send an email22:23
clarkbdoes incident archive?22:24
corvusi'm re-logging in with the addr i'm subscribed to that list to check first before i email22:24
clarkb++22:24
corvusokay, i see the (private) archives for that list; last msg is from october, as expected (ie, not fungi's msg from today)22:25
fungisounds right22:25
corvusemail sent22:26
corvusmsg appears in web archive22:26
fungiperfect!22:26
clarkbit has an archived-at header too22:26
clarkbexcellent22:26
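
[Editor's sketch] The failure mode worked out above (mailman-core posting to a base URL whose path prefix the web container no longer serves) can be modelled roughly as follows. This is an illustration under assumptions, not the real mailman-web routing: the served prefixes and the API suffix are hypothetical stand-ins, and only the two base URLs come from the discussion.

```python
# Hypothetical model of the routing mismatch discussed above. The prefix
# table and the API suffix are illustrative assumptions, not the contents
# of the real mailman-web urls.py.
from urllib.parse import urlsplit

# Prefixes assumed to be served by the web container after the urls.py update.
SERVED_PREFIXES = ("/archives/", "/mailman3/")

# Assumed path suffix for the archiving endpoint (illustrative only).
API_SUFFIX = "/api/mailman/archive"


def status_for(base_url: str) -> int:
    """Return 200 if base_url + API_SUFFIX falls under a served prefix, else 404."""
    path = urlsplit(base_url.rstrip("/") + API_SUFFIX).path
    return 200 if any(path.startswith(p) for p in SERVED_PREFIXES) else 404


# The old and new values of HYPERKITTY_URL from the conversation.
old = "http://127.0.0.1:8000/hyperkitty"
new = "http://127.0.0.1:8000/archives"

print(old, status_for(old))
print(new, status_for(new))
```

In this model the old base URL falls through to a "page not found" response, which matches the 404s seen in mailmanweb.log, while the realigned one resolves.
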
fungicorvus: are you also in a position to push a change to gerrit reflecting that adjustment?22:26
corvuson it22:26
fungieven better. i'll start working on the change to remove the config for the lists on the old server so the initscripts won't accidentally restart things in the future22:27
opendevreviewJames E. Blair proposed opendev/system-config master: Update internal hyperkitty URL  https://review.opendev.org/c/opendev/system-config/+/86662922:27
corvusfungi: clarkb ^22:27
clarkb+222:28
fungithanks again!22:30
clarkbfwiw looking in the mailmanweb log is what made me think of that fix. But only because i remember updating the urls for web22:36
clarkbonce I understood the processing flow it made a lot of sense. Speaking of writing up docs of some sort is probably a good next step too?22:36
clarkbWe can document some commands like force running cron jobs and also describe the high level flow that I wrote down above to aid debugging22:36
clarkbthe upstream docs are pretty terse when it comes to this stuff, but their mailing list is pretty responsive22:37
opendevreviewJeremy Stanley proposed opendev/system-config master: Remove opendev and zuul sites from old mm2 server  https://review.opendev.org/c/opendev/system-config/+/86663022:43
fungii think that's ^ the necessary config cleanup22:44
*** dasm is now known as dasm|off22:51
corvusfungi: +2s on that but i did not +w22:55
fungithanks!22:55
clarkbcompletely unrelated: anyone know why the order of pipelines on the zuul status page changed?23:00
clarkbgate is no longer in the center23:00
corvusclarkb: https://review.opendev.org/c/openstack/project-config/+/85997723:04
clarkbaha23:05
corvus(they're listed in definition order)23:05
corvusclarkb: fungi gtema also i'm not 100% sure that trigger config is what you want.  you may want 'approval' instead of 'require-approval'.  https://zuul-ci.org/docs/zuul/latest/drivers/gerrit.html#attr-pipeline.trigger.%3Cgerrit%20source%3E.approval23:08
clarkbhrm ya I suspect so23:09
corvusfungi: 866630 requires an update to testinfra (we have tests asserting the service-discuss list)23:10
ianwsigh, was going to file something about the zero margins on smaller width layouts, gitlab seemed to kick me out when i logged in on my work vm and now i can't seem to get back in at all :/23:13
opendevreviewJeremy Stanley proposed opendev/system-config master: Remove opendev and zuul sites from old mm2 server  https://review.opendev.org/c/opendev/system-config/+/86663023:13
fungicorvus: thanks! just noticed that myself and fix is there ^23:13
clarkbianw: on the ansible installation change is there a reason we can't just say version: '<8' state: latest?23:20
clarkbI think that would have the desired effect?23:20
clarkbI'm wondering if we over thought this problem and overlooked a simple fix23:22
clarkbpersonally, I'd like to avoid using a requirements file like that if we can. There is a lot of indirection going on to make that happen. We set ansible vars which template out a file which then gets installed by virtualenv.23:23
ianwthe other thing i realised is that we also need more complex openstacksdk dependences expressed in the ansible venv too, pinning cinderclient23:25
clarkbianw: looks like the old code already had a state argument we could set. Is that worth trying?23:26
ianwi see where you're coming from, but what was there wasn't very simple either.  i kind of like that the requirements file gives us idempotence, because this runs all the time23:26
opendevreviewMerged opendev/system-config master: Update internal hyperkitty URL  https://review.opendev.org/c/opendev/system-config/+/86662923:27
clarkbthe requirements file isn't going to be more idempotent though23:27
clarkbsince we do a pip install --upgrade it should be the same as state: latest? they'll both update minor ansible releases until we update the cap23:28
ianwit should be, as templating will only return changed if the file updates, so we'll only run pip when we actually change a requirement23:28
ianwjust have to think about the "latest" thing23:28
clarkboh I see you're relying on the file changed result parameter23:28
ianwthe problem is when that is combined with paths installation i think23:28
clarkbso that won't actually update minor ansible versions either23:29
clarkbwe'd have to make a noop edit to the file to trigger those23:29
ianwhrm, true...23:31
ianwthe old idea that we could set our production ansible version to a github tag doesn't work in the world of collections23:32
clarkbbecause the github stuff is like 10 repos now?23:32
ianwa long time ago i feel like we've hit brown-bag ansible issues and had to switch to someones github fork for a little while23:32
ianwbut that won't work in production because it won't have the ansible collections installed23:33
clarkbbut we really only use github today for testing future ansible right?23:33
ianwnot even really; we install zuul's checkout (but yeah, via github)23:34
clarkbianw: looking at the change more closely and thinking about some of the goals above I half wonder if a lockfile/constraints is really what you are hoping to express. That gives you idempotency and when you want to update you edit and you get a new version23:37
clarkbianw: maybe we can express that without extra files? except I think both constraints and the lockfile only work as file inputs ?23:37
*** rlandy|rover is now known as rlandy|out23:37
clarkbthinking out loud here. I don't actually know if that is helpful23:37
ianwhrm, yeah i take your points23:38
ianwI feel like much more removal than addition in the diffstat "95 insertions(+), 198 deletions(-)"  means we're onto something cleaning this up a bit23:39
clarkbya I think the simplification is good. It just seems really odd that we need to write a file to pip install something when pip and virtualenv have ansible constructs. However, we break out to shell to do lots of git things in ansible because the git construct isn't very good so maybe that is the case here too23:40
clarkbinfra-root I've just done some quick edits to the meeting agenda. Please add any other edits in the next half hour or so then I'll get that sent out23:40
clarkbianw: looks like install_ansible_ara_callback_plugins.stdout has been used in ansible.cfg.j2 but we didn't set the flag previously?23:50
clarkboh wait I see it now nevermind23:50
ianwyeah, that was a bit confusing.  i updated that comment to hopefully make it clearer23:51
ianwi agree the idempotence is broken with the <8 version specifier23:52
ianwi guess ultimately there's no way around that23:52
ianwyou either check if there's something more recent, or hard-code the version23:52
clarkbya23:52
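
[Editor's sketch] The trade-off reached above can be shown with a small stamp check. This is a minimal sketch assuming the "only run pip when the templated requirements text changes" approach; the helper names and the example pin are hypothetical and not taken from the system-config change.

```python
# Sketch of the "only run pip when the requirements text changes" idea.
# The helper names (digest, needs_pip_run) are hypothetical.
import hashlib
from typing import Optional


def digest(requirements: str) -> str:
    """Stable fingerprint of the rendered requirements text."""
    return hashlib.sha256(requirements.encode("utf-8")).hexdigest()


def needs_pip_run(requirements: str, stamp: Optional[str]) -> bool:
    """True when no stamp exists yet or the requirements text has changed."""
    return stamp != digest(requirements)


capped = "ansible<8\n"
stamp = None

first = needs_pip_run(capped, stamp)    # no stamp yet, so pip runs
stamp = digest(capped)

# The text is unchanged, so pip is skipped even if a newer release that
# still satisfies "<8" has since been published.
second = needs_pip_run(capped, stamp)

# Hard-coding a version (example pin only) changes the text, so pip runs again.
pinned = "ansible==7.0.0\n"
third = needs_pip_run(pinned, stamp)

print(first, second, third)
```

The second check is the case raised in the discussion: a capped specifier keeps the run idempotent but never picks up new minor releases unless the file is edited or pip is asked to check every time.
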
corvusfungi: a punch-list item for mm:  as a list owner, i just got some bounce messages (which i suspect may have been in response to something recently dislodged from a queue since the list in question didn't have a recent email).  the "From:" header on the cover letter from mailman (ie, the message mailman sent to me as the list owner which then had the actual bounce message as an attachment) is from "changeme@example.com".  so i suspect there may23:53
corvusbe a missing setting in there somewhere.23:53
clarkbianw: I've gone ahead and +2'd it as this does reduce the amount of code and adds more comments which are helpful and I think it will mostly work maybe just not exactly as originally envisioned (the edit requirements thing)23:54
clarkbI can live with that while we figure out something better if we decide to do that23:54
corvusfungi: i forwarded that message to you personally, as an attachment, so you should be able to see all of that.  if anyone else wants to look into that, let me know and i can forward it to you as well.23:55
corvus(i did a quick check of the settings/bounce processing page in postorius for the list and did not see anything relevant)23:56
clarkbcorvus: fungi  https://github.com/maxking/docker-mailman/blob/d928d36b97fab6fac2a6295ef5822549a68ed0c8/README.md#site-owner23:56
clarkbthats possibly something that just got overlooked as a file to edit23:57
clarkbyup I think that file is just missing23:58
corvusthere's a /var/lib/mailman/core/var/etc/mailman.cfg that is essentially empty except for a comment about it being auto-generated23:58
corvusso probably we copy that file out into system-config and add that setting to it?23:58
corvus(or leave it alone and create mailman-extra.cfg?  i dunno, all new to me)23:59
clarkbcorvus: I think the docker image treats the mailman-ext.cfg special and incorporates it into that file23:59
clarkbits part of the startup routine to do that bit23:59
ianwclarkb: thanks, agree we can probably do even better23:59
corvusoh interesting/weird23:59
