Thursday, 2023-02-09

tonybLooks like gitea08 is fine now?00:33
fungilast i checked00:41
tonyb\o/00:42
tonybWas it the VM or some other "bad actor" ?00:42
fungithough more generally, it seems like some traffic through zayo london and atlanta may be having trouble getting to vexxhost sjc1 where the gitea servers are00:42
tonybare they all in the same DC?00:44
* tonyb doesn't go through London ;P https://paste.opendev.org/show/b8t6OfqPu1wzc374ikiR/00:46
fungithe gitea servers? yeah all in the same vexxhost cloud region00:48
fungithe gerrit server is in a different vexxhost region (ca-ymq-1)00:48
ianwall the giteas have also been rebooted in the last hour or so, just in case00:48
fungioh, i missed that00:48
ianwfungi: heh, that was the idea -- i downed them in the lb and updated docker and rebooted00:49
fungino new containers though00:49
ianwi'll quickly do the lb later when it's quiet00:49
fungistill doesn't have the donor logos00:49
fungibut we can wait for something else to merge to trigger another image pull on them00:49
ianwoh, sorry i didn't pull, it was just getting the new daemon00:49
fungino worries, it's not pressing00:49
fungiwas just curious to see them in production if they actually were00:50
ianwi have the up down scripted, i can run it again with a pull00:53
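The down/up ianw has scripted runs against the haproxy admin socket on the load balancer; a minimal sketch of the idea, where the socket path and the backend/server names are assumptions rather than necessarily what our deployment uses:

    # take a backend out of rotation before maintenance (names/path are placeholders)
    echo "disable server balance_git_https/gitea08.opendev.org" | socat stdio /var/haproxy/run/stats
    # ... upgrade docker, reboot the backend, wait for it to come back ...
    echo "enable server balance_git_https/gitea08.opendev.org" | socat stdio /var/haproxy/run/stats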
tonybOkay, FWIW the problems I was seeing were that origin/https/gitea? were slow but gerrit/ssh were fine ... and for a while origin was behind gerrit (for >15mins) in terms of visible commits00:53
fungithat could be consistent with the slow transfer speeds and very high latency i was seeing through some parts of the zayo backbone00:59
tonybPossibly (not that I use/transit Zayo?), it's all fixed now, so yay01:00
fungii'm seeing fairly decent bandwidth between gerrit and the vexxhost sjc1 network at the moment though, just testing http transfer of a test file01:01
tonybYeah it's all good for me right now01:02
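A minimal one-liner for the kind of http transfer check fungi describes, pointed at the same backend url tonyb times below rather than the actual test file used:

    curl -s -o /dev/null -w 'connect %{time_connect}s  total %{time_total}s  speed %{speed_download} B/s\n' \
        https://gitea08.opendev.org:3081/openstack/governance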
ianwfungi: ok, the logos are there now :)01:04
fungithanks!01:07
fungihttps://opendev.org/#cloud-donors01:17
tonybnice01:18
tonybI have a workaround but I'm seeing gitea08 as slow again02:15
tonybfor i in 1 8 ; do time curl -s https://gitea0$i.opendev.org:3081/openstack/governance>/dev/null; done02:16
tonybreal    0m0.919s02:16
tonybreal    3m28.998s02:16
opendevreviewIan Wienand proposed opendev/system-config master: gerrit: increase ssh channel debugging  https://review.opendev.org/c/opendev/system-config/+/87321402:18
opendevreviewIan Wienand proposed opendev/system-config master: gerrit: increase ssh channel debugging  https://review.opendev.org/c/opendev/system-config/+/87321402:54
ianwload average: 1.16, 2.13, 22.3803:02
ianwit's busy but not crazy03:02
ianwthe first half of the executors have updated docker and rebooted03:09
*** yadnesh|away is now known as yadnesh04:04
ianwclarkb: looks like some types changed which breaks the master build with the ssh debug patch (https://gerrit.googlesource.com/gerrit/+/4f7ef7d590ba99342948fe6c26d24e4c3cd7003d%5E%21/#F0)05:27
ianwsince we'd only keep this on for a few days, until we catch an error or two, i'd propose we just leave the master build broken for that period.  if upstream wants to take it, obviously we can port it properly in gerrit and problem solved05:28
ianw(i'm talking about https://zuul.opendev.org/t/openstack/build/a5636f778eea4fd18fff6b80af17b884/console)05:33
*** jpena|off is now known as jpena08:27
*** ysandeep is now known as ysandeep|afk09:52
*** ysandeep|afk is now known as ysandeep10:53
opendevreviewyatin proposed openstack/project-config master: Cache Cirros 0.6.1 images  https://review.opendev.org/c/openstack/project-config/+/87324411:36
*** ysandeep is now known as ysandeep|afk11:55
*** ysandeep|afk is now known as ysandeep12:31
*** ysandeep is now known as ysandeep|afk13:21
*** dasm|offline is now known as dasm|rover13:47
opendevreviewRon Stone proposed openstack/project-config master: Update StarlingX docs promote job for R8 release  https://review.opendev.org/c/openstack/project-config/+/87326614:22
fungislittle1_: if you're good with ron's https://review.opendev.org/873266 i'll go ahead and approve it14:53
fungislittle1: ^ (i guess you're in here twice at the moment)14:54
*** yadnesh is now known as yadnesh|away14:58
*** ysandeep|afk is now known as ysandeep|PTO15:25
*** dasm|rover is now known as dasm|bbiab15:25
clarkbianw: ya we can break the master gerrit builds. I need to review your change. I'll try to get to that this morning. There was a zuul bugfix too15:44
fungii should have waited another day to do my python recompiles. new 3.10, 3.11 and 3.12 prerelease last night16:13
* fungi sighs16:13
*** dasm|bbiab is now known as dasm|rover16:28
clarkbfungi: at least in the winter the extra heat output isn't so bad17:18
fungitoo true17:21
*** jpena is now known as jpena|off17:33
opendevreviewAde Lee proposed zuul/zuul-jobs master: Add ubuntu to enable-fips role  https://review.opendev.org/c/zuul/zuul-jobs/+/86688118:08
clarkbI think one of our swift job log endpoint targets is having a sad resulting in POST_FAILURES for jobs. The first one I've tracked down was for rax dfw19:01
clarkbtrying to look at a few more to see if it is more widespread than that19:01
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: Add ubuntu to enable-fips role  https://review.opendev.org/c/zuul/zuul-jobs/+/86688119:02
clarkbnow up to two failing in rax dfw19:02
clarkblet me check one more and if it isn't also in another region I'll push a change for that region19:02
fungithanks19:04
fungistanding by to approve. i guess we may need to bypass zuul and submit it directly in gerrit?19:04
clarkbthird one is in rax_iad19:04
clarkbso maybe this is rackspace wide19:04
fungijust do all three19:04
clarkback incoming19:04
opendevreviewClark Boylan proposed opendev/base-jobs master: Temporarily disable rax swift log uploads  https://review.opendev.org/c/opendev/base-jobs/+/87332019:06
clarkbthere is a decent chance that will need to be force merged but we can try to do it normally first19:06
clarkbfungi: ^19:08
fungiclarkb: is commenting safe there? i thought ansible got cranky if you had a ' in a comment19:09
fungibut maybe it's only if the ' is mismatched?19:10
clarkbfungi: it was only if unmatched19:10
clarkbso this should be fine19:10
fungii'll just bypass zuul in that case, thanks19:11
opendevreviewMerged opendev/base-jobs master: Temporarily disable rax swift log uploads  https://review.opendev.org/c/opendev/base-jobs/+/87332019:11
opendevreviewMerged zuul/zuul-jobs master: Add ubuntu to enable-fips role  https://review.opendev.org/c/zuul/zuul-jobs/+/86688119:11
clarkbI'm not seeing anything on their cloud status page that would indicate this is a known problem yet19:14
fungi#status log Bypassed testing to merge urgent change 873320 temporarily disabling uploads to one of our log storage providers which is exhibiting problems19:14
opendevstatusfungi: finished logging19:14
clarkbdon't see email either so this is probably the leading edge of whatever it is and hopefully we'll have better indication of what is going on in a bit19:15
clarkbfungi: also 873320 threaded the needle huh?19:16
clarkber sorry I mean 86688119:16
fungiapparently19:17
fungii've approved 872222 now since it was waiting on that19:17
fungiand then we can recheck and hopefully approve 87222319:17
fungiand then start trying to exercise it and find out what else is still broken19:18
opendevreviewMerged openstack/project-config master: Add base openstack FIPS job  https://review.opendev.org/c/openstack/project-config/+/87222219:22
opendevreviewClark Boylan proposed opendev/base-jobs master: Update base-test to only upload to rax  https://review.opendev.org/c/opendev/base-jobs/+/87332119:23
clarkbfungi: ^ landing that will allow us to check if rax is working via the base-test job more easily19:23
ianwok zuul01/02 are the last two on the docker upgrade list19:38
fungiother than the list servers which i haven't finished yet19:39
ianwoh right; yep and lists01 -- the others aren't dockerised right?19:39
fungiright19:40
ianwfungi: if down/up is reliable i'm happy to do it -- just didn't want to step on anything in progress19:40
fungii think it probably is, i believe the failure i hit on the held node was unrelated to the docker package upgrade and instead something incomplete with the domain filtering config19:42
ianwok, i'll try ...19:43
opendevreviewMerged opendev/base-jobs master: Update base-test to only upload to rax  https://review.opendev.org/c/opendev/base-jobs/+/87332119:43
fungii guess go for it, worst case i jump into troubleshooting a config problem with the listserv but brief outages there aren't a disaster19:43
fungithanks!19:44
ianwdoing a quick reboot as it pulled  a new kernel, it hadn't been up that long though19:46
fungicool19:46
clarkbre zuul01/02 those should be straightforward as well19:47
clarkbhttps://review.opendev.org/c/zuul/zuul-jobs/+/680178 has been rechecked to gather an initial set of info on the rax failures19:48
clarkbreally it's just going to say "is this still happening or not" and then we can debug from there.19:48
ianwok; https://lists.opendev.org/mailman3/lists/ seems up, the containers don't have anything crazy in logs19:49
fungithanks!19:52
ianwit does say 19:53
ianw"Your models in app(s): 'django_mailman3', 'hyperkitty', 'postorius' have changes that are not yet reflected in a migration, and so won't be applied."19:53
ianwdo we need to manually apply migrations?19:53
fungii thought i saw the compose running it at each start19:54
ianwhrm it does give conflicting information19:55
ianwOperations to perform:19:55
ianw  Apply all migrations: account, admin, auth, contenttypes, django_mailman3, django_q, hyperkitty, postorius, sessions, sites, socialaccount19:55
ianwRunning migrations:19:55
ianw  No migrations to apply.19:55
ianw  Your models in app(s): 'django_mailman3', 'hyperkitty', 'postorius' have changes that are not yet reflected in a migration, and so won't be applied.19:55
ianwthat's the full output19:55
ianwi guess it's looking for a migration.py file to apply, and can't find one, but seems to know the models might need it19:56
ianwplaying taxi, bib19:57
ianwso the entrypoint does run migrate -> https://github.com/maxking/docker-mailman/blob/main/web/docker-entrypoint.sh#L12620:52
ianwbut not makemigrations afaics.  it does seem that is something that needs to be done as new versions are included20:53
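A non-destructive way to check whether the image's models and its shipped migrations agree, without writing any files; a sketch assuming the compose service is called mailman-web and manage.py sits in the container's working directory, as the entrypoint suggests:

    # exits non-zero and names the apps if any model changes lack migrations
    docker-compose exec mailman-web python3 manage.py makemigrations --check --dry-run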
clarkbthe dnm base-test change ran 3 rax ord uploads and one each for dfw and iad. All were successful. I've rechecked again to gather more data, but it's possible this blip was short lived?20:54
fungiianw: interesting, we haven't actually upgraded mailman services since the initial deployment, so i wonder what we're lacking20:54
ianwfungi: yeah, i'm thinking run makemigrations and see what it spits out20:54
clarkbianw: note we're on a fork, but one that tries to be close to upstream20:55
clarkber of the docker images I mean20:55
ianwhere's what it thinks is different20:56
ianwhttps://paste.opendev.org/show/bWnwl4DPOryqpGVNmbml/20:56
clarkbwait so the application doesn't actually ship with proper migrations? I find django db stuff very confusing20:58
ianwyeah, i'm wondering how much of /usr/lib/python3.10/site-packages/django_mailman3/migrations/ is in the upstream container21:00
ianwi think that all these are changing an "id" field21:04
ianwperhaps to a models.BigAutoField21:05
ianw# Explicitly set default auto field type to avoid migrations in Django 3.2+21:06
ianw    default_auto_field = 'django.db.models.AutoField'21:06
ianw^ other projects set that21:06
clarkboh maybe because I set the value in our settings file to make a warning about it go away21:06
clarkb?21:06
clarkbthough I thought i set it to the default21:06
ianwclarkb: hrm, i don't see anything related to this in our settings files21:08
clarkblet me see if I can find it.21:08
clarkbianw: opendev/system-config/playbooks/roles/mailman3/files/web-settings_local.py:DEFAULT_AUTO_FIELD = 'django.db.models.BigAutoField'21:09
ianwoh right, AUTO_FIELD not auto_field :)21:09
clarkbin particular the warning we got seemed to say big auto field was the default so I set it to that iirc. But maybe it isn't the default and the normal size is, and now it wants to migrate us to the big size?21:10
ianwit looks like it changed in django 3.2+21:13
clarkbianw: not to distract from django, but looking at your java change to gerrit, it looks good. Thomas had a suggestion for an additional logging point though21:23
clarkbthat said, I'm looking at the log files from our system-config-run 3.6 job and not seeing where those log lines end up21:25
clarkbmight be a good idea to ensure we're writing those logs in the test job so we can confirm they are working before pushing to prod?21:26
ianwclarkb: so we need to turn up the logging flag before that will show21:27
clarkbaha21:27
clarkbianw: seems like we should do that in CI anyway? would proably be useful in general there21:27
ianwyes, i can look at it.  either via config, or you can do it on the live system with the gerrit log-level (i think, something like that)21:27
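The live-system route ianw is thinking of is gerrit's ssh logging command (assuming it's available on our version; the logger name and admin account here are placeholders, not the logger the change actually touches):

    # raise the level for one logger at runtime, then put it back afterwards
    ssh -p 29418 admin@review.opendev.org gerrit logging set-level debug org.apache.sshd
    ssh -p 29418 admin@review.opendev.org gerrit logging set-level info org.apache.sshd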
clarkband ya no real concern from me to temporarily break the master builds. I think we may have done so in the past anyway21:29
opendevreviewIan Wienand proposed opendev/system-config master: mailman: set web auto field size to "AutoField"  https://review.opendev.org/c/opendev/system-config/+/87333721:31
ianwnow I've run makemigrations -- i didn't realise it would write to /usr.  i want to be careful to *not* apply that.  so I think i'll down the containers, which should delete them, and then up so they come up with only the migrations in the actual container image21:32
ianwi don't want it to somehow restart and apply the migration to upgrade all the ids21:33
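Recreating the containers drops whatever makemigrations wrote to the container's writable layer; a sketch of the down/up ianw describes, run from the compose directory on the list server:

    docker-compose down    # removes the containers along with the generated files under /usr
    docker-compose up -d   # fresh containers with only the migrations baked into the image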
ianwdoing school run v2 ... bib21:33
clarkbmakes sense especially if we're switching to the current value21:34
clarkbon the second pass of my dnm log check change all 5 uploaded to dfw22:16
clarkbI'm going to push up reverts now I guess and we can decide if we think this is sufficient checking22:16
*** dasm|rover is now known as dasm|off22:18
opendevreviewClark Boylan proposed opendev/base-jobs master: Revert "Temporarily disable rax swift log uploads"  https://review.opendev.org/c/opendev/base-jobs/+/87333922:18
ianwlgtm; we can always turn it off again22:22
opendevreviewMerged opendev/base-jobs master: Revert "Temporarily disable rax swift log uploads"  https://review.opendev.org/c/opendev/base-jobs/+/87333922:29
ianwi made a merge request about the key type -> https://gitlab.com/mailman/django-mailman3/-/merge_requests/189 ... see what happens22:31
ianwi've restarted mailman containers again.  so it still has that warning, but we know what's going on now22:32
ianwi see that update on the gerrit change, i'll take a look22:39
opendevreviewIan Wienand proposed opendev/system-config master: gerrit: increase ssh channel debugging  https://review.opendev.org/c/opendev/system-config/+/87321423:29
ianw^ that should set the logging flag.  i couldn't see an api way to do that?23:30
ianwi mean http api, i guess the cmd-line is an api of sorts23:30
