Monday, 2020-09-28

openstackgerritIan Wienand proposed zuul/zuul-jobs master: update-json-file: add role to combine values into a .json
openstackgerritIan Wienand proposed zuul/zuul-jobs master: ensure-docker: Linaro MTU workaround
openstackgerritDmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Make sure that we pass list in loop
openstackgerritMerged zuul/zuul-jobs master: Add ability to use *-docker-image roles in periodic jobs
openstackgerritAlexander Szakaly proposed opendev/git-review master: Fix bug in git_credentials()
openstackgerritzbr proposed opendev/elastic-recheck master: WIP: Run elastic-recheck container
openstackgerritzbr proposed opendev/gerritlib master: Allow custom retries on gerrit connection
openstackgerritzbr proposed opendev/gerritlib master: Bump hacking
zbrclarkb: corvus: please help me with ^ gerritlib changes, should be very easy to review.14:50
openstackgerritzbr proposed zuul/zuul-jobs master: More E208 mode fixes
clarkbinfra-root if we can land I think that is necessary now that we've stopped mirroring 15.115:09
clarkb(I probably should have set a depends-on for the mirror removal, my mistake)15:10
clarkbAlso I have removed my WIP from which removes fedora-30 from nodepool the rest of the way. Haven't seen any issues with fedora-30 being removed from the launchers15:11
openstackgerritzbr proposed zuul/zuul-jobs master: Add file mode to fetch-sphinx-tarball
openstackgerritzbr proposed zuul/zuul-jobs master: Partial address ansible-lint E208
clarkbthe last item on my weekend catch up todo list is restarting zuul-web for the link rendering fix that landed recently15:15
clarkbany objections to me doing that after some tea?15:15
zbrclarkb: i was away for one week so i am not sure what was the decision regarding gerritlib and old python. can we drop py27/py35 or not?15:15
clarkbzbr: we still need someone to check aiui15:15
clarkbbasically use codesearch or similar to find jeepyb and gerritlib use then double check against python versions15:16
AJaegerconfig-core, please review
clarkbzuul is fine and I think manage-projects is fine now as well (it runs in the gerrit container) but we should check that15:16
zbrclarkb: technically we could drop it without worry by updating python-requires.15:16
zbrand removing universal wheel15:17
clarkbzbr: but then you don't get bug fixes and I think we want to move those old python2 things to python315:17
zbrwe can create maintenance branch, *if* needed (fix needed + easier to patch than to switch python on that consumer).15:18
clarkbor we can just address the consumption side which may already be done (needs checking)15:18
clarkbzuul, e-r, and manage-projects are 3 consumers off the top of my head and they all use python3 now I think. We just need to confirm there isn't other use of it and if there is ensure it is python315:19
zbrgermqtt is another one15:20
clarkbI wonder if we should turn off germqtt and firehose15:21
clarkbfungi: ^ you may have thoughts15:21
zbr reports only these.15:21
fungiyeah, looks like puppet is currently installing germqtt onto the system with pip for python2.715:23
fungithe server is running ubuntu 16.04 lts which has python3.5 installed15:23
zbrdoes it install gerritlib from git or wheel? if wheel we have no reason to worry.15:24
zbronce we drop support for old pythons, the new wheel will not get picked by pip from these systems.15:24
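The pip behaviour zbr describes above can be sketched roughly as follows. This is a simplified model under stated assumptions, not pip's actual resolver: the release list, both version numbers, and the minimum-Python tuples are hypothetical and only illustrate how an installer skips releases whose declared minimum Python is newer than the running interpreter.

```python
# Simplified model of how an installer honours Requires-Python metadata.
# The release data below is hypothetical, for illustration only.
RELEASES = [
    ("0.6.0", (2, 7)),  # older release, still installable on python2.7
    ("1.0.0", (3, 6)),  # hypothetical release declaring python_requires >= 3.6
]


def version_key(version):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newest_compatible(releases, interpreter):
    """Return the newest version whose minimum Python is satisfied, or None."""
    candidates = [v for v, minimum in releases if interpreter >= minimum]
    return max(candidates, key=version_key) if candidates else None


print(newest_compatible(RELEASES, (2, 7)))
print(newest_compatible(RELEASES, (3, 8)))
```

In this model a python2.7 system keeps resolving to the older release, while a python3.8 system picks up the newer one.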
fungithe last change to merge through zuul for germqtt was which was tested against python3.5 so presumably would work with it15:25
fungiit's using gerritlib 0.6.015:25
fungiso a ways behind on that anyway15:26
fungi0.6.0 is a little over 4 years old15:26
clarkbwell I wonder if anyone is using germqtt and the last person who tried failed15:27
clarkb*last known person15:27
clarkb(and firehose in general)15:27
fungii agree that it's not going to break from dropping old python support even if we release new versions of gerritlib, unless we want to redeploy it and use the same configuration management15:27
clarkbafter EOLing ask.o.o I'm more brave about turning other things that are unused or underused or unmaintainable off15:28
clarkbpbx is also on that list and we should probably just disable it15:28
fungiyeah, while firehose was a cool idea and a good proof of concept, i'm not convinced it's getting sufficient use to warrant continued maintenance15:29
fungimtreinish and i will need to raise a glass in its memory15:29
fungiclarkb: while you're looking at gerritlib changes, any input on 750849?15:30
fungithat's done with the goal of supporting configurable ssh keepalive in gerritbot15:32
openstackgerritMerged openstack/project-config master: Remove fedora-30 from the nodepool builders
clarkbfungi: zbr ya that change lgtm I think it conflicts with zbr's hacking and reconnection setting changes15:33
clarkbdo we have a preference for landing order?15:33
clarkb(I didn't approve 750849 yet but I think we can if we want to land it first)15:33
fungigetting the attention of the submitter for 750849 to update it for a merge conflict may be harder since they're not a regular, but we could also update it for them15:34
fungiso i don't really have an ordering preference15:35
clarkbgood point, I've approved it given ^15:36
openstackgerritzbr proposed opendev/gerritlib master: Refreshed linters calling
mtreinishfungi: :( oh well. Yeah I agree it probably isn't super widely used especially since it apparently wasn't working for a long time15:40
mtreinishfwiw, I don't think there is anything in germqtt that is incompatible with newer python versions. It's pretty simple15:41
clarkbya most of our uplifting of python versions has been straightforward. Typically a print foo or a bytes vs str mismatch.15:42
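The two kinds of fix clarkb mentions typically look like the sketch below. The function names and sample data are hypothetical and not taken from any of the repositories discussed.

```python
# Python 3 versions of the two usual fixes (illustrative only).

def greet(name):
    # Python 2 allowed the statement form `print name`; Python 3 needs a call.
    print(name)


def read_event(raw):
    # Data from a socket or subprocess arrives as bytes in Python 3 and must
    # be decoded before it can be concatenated with str values.
    return "event: " + raw.decode("utf-8")


greet("foo")
print(read_event(b"patchset-created"))
```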
clarkbJust at a higher level we have some services like pbx, survey, firehose, etc that are not well used either because there are other tools people are using or they just haven't found need for them and maybe we need to turn those off and reduce our overhead15:43
mtreinishlol, well I just ran tox -epy38 on the repo HEAD and the single unittest passed :)15:43
mtreinishoh sure I agree, I don't think we need to keep running firehose if there aren't any users.15:44
fungii was hopeful, and we gave it a solid 4 years for folks to find stuff to use it for15:45
openstackgerritMerged opendev/gerritlib master: Add support for TCP keep alive in gerritlib
openstackgerritzbr proposed opendev/gerritlib master: Decouple linters from unit testing
openstackgerritzbr proposed opendev/gerritlib master: Allow custom retries on gerrit connection
fungiuh oh, looks like afsdb02 is offline/hung, which probably explains the "u: no quorum elected" errors thrown when attempting to vos release stuff15:52
fungii'll see if i can get anything useful from its console and then probably hard reboot it15:52
clarkbfungi: :/ let me know if I can help with the recovery15:53
fungiwill do15:54
clarkb(next on my list is cleaning up the fedora-30 and atomic content so working afs is useful :) )15:54
openstackgerritzbr proposed opendev/gerritlib master: Make py36 minimal supported version
fungihung kernel tasks reported to the console ~48543960s after boot15:59
fungino login prompt15:59
fungiunresponsive to carriage return15:59
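For scale, the seconds-since-boot figure quoted from the console can be converted to days with a few lines of Python. This is just arithmetic on the number above; it works out to roughly a year and a half of uptime before the hang.

```python
from datetime import timedelta

# Seconds-since-boot value quoted from the console output above.
uptime = timedelta(seconds=48543960)

hours, remainder = divmod(uptime.seconds, 3600)
minutes = remainder // 60
print(f"{uptime.days} days, {hours} hours, {minutes} minutes")
# prints "561 days, 20 hours, 26 minutes"
```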
fungi#status log hard rebooted afsdb02 via nova api following hung kernel tasks around 13:10 utc16:02
openstackstatusfungi: finished logging16:03
fungiwatching log for the volume releases which are about to kick off again in a few seconds16:04
zbrclarkb: should I also update metadata to point to service-discuss for gerritlib? (likely we need to do it for other packages too)16:05
fungivos release still failing, but much more quickly now16:05
clarkbfungi: the lock may be stale if there were previous releases?16:05
clarkbzbr: yes I think that is fine16:05
fungiclarkb: there's still a quorum issue, so i probably need to do something to get the databases back in sync between afsdb01 and 0216:06
fungibut now instead of spending 5 minutes trying to get a response from afsdb02, commands are returning the error immediately instead16:07
fungiso... progress16:07
clarkbopenafs docs seem to imply bos managed services are responsible for the db services.16:08
clarkband ubik is the sync tool?16:08
openstackgerritzbr proposed opendev/gerritlib master: Fix package metadata
fungithis is stunning:
clarkbhuh maybe that one needs a reboot too :)16:09
fungia few days ago the load average began rising linearly, topped out at 500 before it died16:09
fungithat's the one i rebooted16:09
clarkbah I thought it was the fileserver that rebooted not the db. got it16:10
clarkbdouble check that bos services are running on it after reboot and if so I wonder if we can force a sync from db0116:10
fungiyeah, we have two db servers, one in dfw and a second in ord. the latter is the one which hung16:10
*** tosky has quit IRC16:12
fungii'll see where gets me to16:13
fungiokay, after manually running a bos shutdown for afsdb02 and starting the services again, the new vos release pulse seems to be doing something (at least not erroring out immediately like it was after the server reboot)16:26
openstackgerritzbr proposed openstack/project-config master: Drop old python jobs from gerritlib
openstackgerritzbr proposed opendev/gerritlib master: Make py36 minimal supported version
openstackgerritzbr proposed openstack/project-config master: Move gerritlib jobs to in-repo config
openstackgerritzbr proposed opendev/gerritlib master: Make py36 minimal supported version
openstackgerritzbr proposed opendev/gerritlib master: Fix package metadata
clarkbfungi: looks like the opensuse 15.1 cleanup hasn't happened yet on afs likely due to that same issue16:34
clarkb that should all go away on the next run I think (I'll watch it)16:35
fungiclarkb: yeah, i just manually ran a vos release for project.tarballs and it worked16:35
fungithe cronjob on mirror-update on the other hand reported an issue with the docs volume16:36
fungithough that may have been due to a concurrent vos release run now that i look closer16:36
fungiyeah, that's it16:37
clarkbI think the suse pulse will run at 18:07 UTC which is about 1.5 hours from now. I'm not in a huge rush so that should be fine16:37
fungithere's a vos release of the docs volume underway since 16:25 which will presumably complete soon16:37
clarkb is a related change for fedora-3016:37
fungisome volume releases are probably going to take a while due to the opensuse and fedora cleanups16:38
fungiand also just general churn from not having been able to successfully perform vos release for a little while16:38
clarkbhowever once we get the fedora and opensuse cleanups done that should make things happier in the long run as there will be less overall churn16:39
fungiwell, maybe. it's not like those trees were probably seeing any actual changes16:39
openstackgerritMerged openstack/diskimage-builder master: Test opensuse 15 builds with 15.2
fungiseems to have gotten past the docs volume16:42
fungithings are updating correctly looks like16:42
fungiokay, vos release runs seem to have all completed for the static content volumes, so that's back on track16:50
fungimirror volumes may still take a while, not sure16:50
clarkbfungi: when we dropped opensuse 42.3 it was surprisingly fast. I think rm's are quick16:50
clarkball it has to sync is the stat changes not the contents so it is fast?16:51
fungiahh, good point16:51
fungiso maybe it's already done, or will be at the next mirror update pulse at 18:00ish16:51
openstackgerritzbr proposed opendev/gerritlib master: Make py36 minimal supported version
fungii need to step away and knead pizza dough for tonight now that afs is back under control16:52
fungiwill be back shortly16:52
clarkbfungi: I'll restart zuul web when you return. Also now I want pizza, I think our plan is tacos for lunch17:00
zbrclarkb: did still run py27/py35 jobs regardless if we had depends-on to remove them from project-config.17:11
clarkbzbr: project-config is a trusted project in zuul so changes have to be merged to it before they apply17:11
clarkbthis is why it is a good idea to keep as little as possible in that repo17:12
zbrso time to +W that change and do a recheck17:12
clarkbfungi: are you back now? is it a good time for me to restart zuul-web?17:24
fungiyep, please feel free to proceed17:29
clarkbthe links to post job urls seem to be there as expected and status page is loading for me17:30
clarkb#status Log redeployed zuul-web to pick up javascript updates for proper external link rendering on builds and buildsets17:30
openstackstatusclarkb: finished logging17:30
fungiyeah, looks fine17:43
clarkb has been cleaned up, still waiting on
openstackgerritMerged openstack/project-config master: Move gerritlib jobs to in-repo config
openstackgerritSean McGinnis proposed openstack/project-config master: Complete retirement of x/osops-* repos
AJaegersmcginnis: thanks - I left some comments to clean up your change ^18:59
openstackgerritSean McGinnis proposed openstack/project-config master: Complete retirement of x/osops-* repos
clarkbAJaeger: see comment on
openstackgerritMerged zuul/zuul-jobs master: Allow skip files when download logs
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: Add ensure pre-run policy to ansible-lint
AJaegerthanks, clarkb19:20
openstackgerritSean McGinnis proposed openstack/project-config master: Complete retirement of x/osops-* repos
clarkbfor some reason I thought had landed but I think I saw the devstack fedora-30 removal and confused the two19:26
clarkbbut that explains why fedora-30 hasn't gone from our mirror yet :)19:27
donnydI haven't seen any post failure complaints lately - so I am thinking OE should be fixed at this point19:44
donnydstill working the monitoring solution19:44
clarkbI'm debugging why my touchpad doesn't work after resuming from suspend and arch wiki has a whole section on it
clarkbI gave up and rebooted20:37
openstackgerritAndrii Ostapenko proposed zuul/zuul-jobs master: Fix promote cleanup
fungifwiw the touchscreen driver for these gpd pocket portables still has to get an rmmod/modprobe after hibernating to disk (thankfully not after suspend to ram)22:04
clarkbI didn't see a distinct driver for this touchpad which made me think it is just usbhid and unloading that would be bad :)22:08
ianwclarkb: next you can fix why my external monitor goes crazy randomly22:17
ianwthe dmesg gets you to a bug that's been open for about 8 years22:17
fungii do occasionally have to ssh into my workstation from another device and xset dpms force off/on22:20
fungiremind me why we put up with computers?22:20
ianwfungi: if you have a sec for the gitea apache proxy fix @ that would be good22:25
fungii'm in the middle of some unlicensed kitchen plumbing i didn't plan for, but will try to take a look once i come out the other side22:26
clarkbfungi: ianw any chance we can land in the next little bit?22:34
clarkbthat's the last major todo on my nodepool cleanup thread22:34
clarkb(I also need to look at the atomic image cleanup but was hoping for more feedback on the mailing list first)22:35
ianwlgtm but may need to run manually if the rm takes a while22:35
clarkbthe rm for opensuse 42.3 was super quick22:36
clarkb(it surprised me) so I am hopeful this one will be too22:36
fungiclarkb: there was also some followup in #openstack-containers22:37
fungishort answer is that it's used for testing magnum up through train (though train did introduce support for an alternative)22:37
funginot used for testing magnum ussuri and later22:37
clarkbfungi: ya I looked at the stable branches and for rockyish and back it's fedora 26?22:38
clarkbit's only a couple branches that use the image we have published today22:38
clarkbI guess the question then is can we delete it like we did for the other really old branches (and irc logs say maybe yes)22:38
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Make sure that we pass list in loop
fungifrequency of stable branch patches for magnum means gains from mirroring that are probably ~nonexistent22:45
openstackgerritMerged opendev/system-config master: Stop mirroring fedora 30
openstackgerritClark Boylan proposed opendev/system-config master: Mirror the only Fedora Atomic image used by Magnum
clarkbfungi: ^ that is a safe first step I think. Basically we stop mirroring everything else and then rm the everything else23:13
openstackgerritIan Wienand proposed zuul/zuul-jobs master: update-json-file: add role to combine values into a .json
openstackgerritIan Wienand proposed zuul/zuul-jobs master: ensure-docker: Linaro MTU workaround
