Tuesday, 2020-10-13

*** Tengu_ has joined #opendev01:16
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766001:16
*** Tengu has quit IRC01:19
*** Tengu has joined #opendev01:21
*** Tengu_ has quit IRC01:22
ianwAnsibleUndefinedVariable: the inline if-expression on line 41 evaluated to false and no else section was defined.02:08
ianw{% for item in borg_backup_dirs + borg_backup_dirs_extra -%}02:08
ianw    {{ item }} {{ '\\' if not loop.last }}02:08
ianwthat is line 4102:08
ianwit doesn't fail in gate, but is somehow failing on bridge?02:08
ianw - debug:02:35
ianw        msg: '{% for item in [1,2,3] %} {{ item }} {{ "," if not loop.last }} {% endfor %}'02:35
ianwfails on bridge, but i can't make it fail anywhere else ...02:35
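The loop ianw pasted can be reproduced and sidestepped without Jinja2's version-sensitive inline if. A pure-Python sketch of what the template is trying to build (the directory values here are invented for illustration):

```python
# Pure-Python sketch of what the failing Jinja2 loop builds: each backup
# dir on its own line with a trailing backslash (shell line continuation)
# on all but the last. Joining avoids the inline "{{ '\\' if not loop.last }}"
# whose no-else behaviour differs between Jinja2 2.10 and 2.11.
borg_backup_dirs = ["/etc", "/var/log"]          # hypothetical values
borg_backup_dirs_extra = ["/home/gerrit2"]       # hypothetical value

script_args = " \\\n".join(borg_backup_dirs + borg_backup_dirs_extra)
print(script_args)
```

An explicit else branch (`{{ '\\' if not loop.last else '' }}`) would likewise behave the same on both Jinja2 versions.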
clarkbansible version maybe?02:40
ianwrunning that out of a 2.9.8 virtualenv *on* bridge also works02:41
ianwbridge has Jinja2              2.1002:42
ianwthe venv has Jinja2       2.11.202:43
ianwand if i downgrade the jinja2 in the virtualenv, it fails02:43
ianwso, why isn't the bridge jinja updating i guess is the question02:44
*** dviroel has quit IRC02:44
fungipip doesn't normally upgrade dependencies if they already satisfy the minimum version02:45
ianwyeah, i guess so.  i'm surprised ansible doesn't pin itself to a jinja version02:45
fungithere is an alternative upgrade strategy you can specify, but it will wreak havoc if you're installing some python libs from distro packages02:46
fungiwhich is bound to happen if you install some python applications from distro packages02:47
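fungi's point about pip's default behaviour can be sketched as a toy decision function. Real pip's `--upgrade-strategy only-if-needed` uses packaging's version parsing and full specifiers; this crude tuple compare, and the assumed lower bound, are only illustrative:

```python
# Toy version of pip's default "only-if-needed" upgrade decision.
def needs_upgrade(installed: str, minimum: str) -> bool:
    """Crude compare; real pip uses packaging.version, not int tuples."""
    as_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return as_tuple(installed) < as_tuple(minimum)

# bridge had Jinja2 2.10; if the dependency only declares a low minimum
# such as 2.7 (an assumed bound), pip leaves 2.10 in place even though
# 2.11.2 exists with different inline-if behaviour.
print(needs_upgrade("2.10", "2.7"))    # False: satisfied, left alone
print(needs_upgrade("2.10", "2.11"))   # True: would be upgraded
```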
ianwi guess the solution here is "run ansible out of a virtualenv"02:47
ianwor a containered ansible like openstackclient i guess02:48
fungior run ansible from a distro package02:48
fungibut yeah02:48
ianwi think for now i might just manually upgrade jinja2 on bridge02:49
*** ysandeep|away is now known as ysandeep03:13
ianwansible is avoiding pinning to avoid i guess having any annoying version dependencies for stable distros03:27
ianwbut, that also means you get annoying incompatible behaviour depending on your environment.  so you can't have it both ways03:27
*** ysandeep is now known as ysandeep|afk03:47
openstackgerritIan Wienand proposed opendev/system-config master: [wip] install ansible in a virtualenv on bridge  https://review.opendev.org/75767004:07
ianwhttps://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_2f2/722148/9/check/dib-nodepool-functional-openstack-ubuntu-focal-containerfile-src/2f25c7d/zuul-manifest.json04:14
ianwis it just me, firefox is blocking this as a "deceptive site"?04:14
ianwhttps://transparencyreport.google.com/safe-browsing/search?url=https:%2F%2Fapi.us-east.open-edge.io:8080%2Fswift%2Fv1%2FAUTH_e02c11e4e2c24efc98022353c88ab506%2Fzuul_opendev_logs_2f2%2F722148%2F9%2Fcheck%2Fdib-nodepool-functional-openstack-ubuntu-focal-containerfile-src%2F2f25c7d%2Fzuul-manifest.json04:15
ianwdoes seem to suggest it is not just me04:15
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766004:23
*** ykarel|away has joined #opendev04:34
*** ykarel|away is now known as ykarel04:44
fungidonnyd: "open-edge.io has been reported as a deceptive site. You can report a detection problem or ignore the risk and go to this unsafe site."04:46
fungi"The site https://open-edge.io/ contains harmful content, including pages that: Try to trick visitors into sharing personal info or downloading software"04:46
fungii reported a detection problem, no idea what it takes to get delisted though04:50
*** ykarel_ has joined #opendev05:03
*** ykarel has quit IRC05:04
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766005:07
openstackgerritIan Wienand proposed opendev/system-config master: [wip] install ansible in a virtualenv on bridge  https://review.opendev.org/75767005:21
*** tkajinam has quit IRC05:21
*** tkajinam has joined #opendev05:22
*** ykarel_ is now known as ykarel05:22
openstackgerritIan Wienand proposed opendev/system-config master: [wip] install ansible in a virtualenv on bridge  https://review.opendev.org/75767005:39
*** ysandeep|afk is now known as ysandeep05:43
openstackgerritlikui proposed openstack/diskimage-builder master: replace imp module  https://review.opendev.org/75123606:04
cgoncalvesapi.us-east.open-edge.io is being flagged as phishing by Google Chrome: https://snipboard.io/Tyl5kR.jpg06:07
*** elod_pto is now known as elod06:38
*** eolivare has joined #opendev06:40
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766006:41
ykareli see similar on firefox ^06:48
ykarelDeceptive site ahead06:48
ykarelFirefox blocked this page because it may trick you into doing something dangerous like installing software or revealing personal information like passwords or credit cards.06:48
ykarelfor https://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_c18/757605/1/check/tripleo-ci-centos-8-scenario003-standalone/c1864e3/06:48
ykarelsame on chrome06:49
ianwyeah, i'm not sure what to do but report it as a false positive06:49
ianwthere's no indication *why* it thinks this06:50
ianwdonnyd: ^ perhaps a recycled IP somehow?06:50
jrosserhttps://developers.google.com/web/fundamentals/security/hacked/request_review06:52
*** hashar has joined #opendev06:52
jrosserthe security console there will say why it's done this06:52
*** slaweq has joined #opendev06:57
ianwyeah it wasn't quite clear if that's all the same thing, but i agree the best chance of finding something is probably on the webmaster console07:02
*** ralonsoh has joined #opendev07:02
*** andrewbonney has joined #opendev07:08
jrosserutilit07:08
*** tosky has joined #opendev07:26
*** rpittau|afk is now known as rpittau07:27
*** DSpider has joined #opendev07:32
cgoncalvesseeing this in multiple job builds: "Immediate connect fail for 2604:e100:3:0:f816:3eff:fe6b:ad62: Network is unreachable"07:46
cgoncalveshttps://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_5f5/755777/10/check/octavia-v2-act-stdby-dsvm-scenario-stable-train/5f506d8/controller/logs/dib-build/amphora-x64-haproxy.qcow2_log.txt07:47
ianwthat one was in ovh gra108:03
ianwwhich i don't believe has ipv608:04
ianwmore likely an issue with 38.108.68.124 ? (what is that?)08:04
ianwhuh opendev.org :)08:04
ianwok, it's a curl to get requirements08:05
ianwworks for me, but possible one backend is having issues and i'm not hashed to it08:06
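A small sketch of why "works for me" proves little behind a hashed load balancer: source-IP hashing pins each client to one backend, so a single bad backend only affects the subset of users hashed to it. This is an approximation, not haproxy's actual algorithm, and the client IPs are documentation addresses:

```python
# Approximate source-IP hashing: each client IP deterministically maps
# to one backend, so testers on different IPs can see different results.
import hashlib

backends = [f"gitea{n:02d}" for n in range(1, 9)]   # 8 backends, per later in the log

def pick_backend(client_ip: str) -> str:
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[digest[0] % len(backends)]

print(pick_backend("203.0.113.5"), pick_backend("198.51.100.7"))
```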
*** fressi has joined #opendev08:07
AJaegerwhy is it using curl for requirements? It should use the local download instead...08:14
*** qchris_ has quit IRC08:16
*** qchris has joined #opendev08:17
*** priteau has joined #opendev08:33
fricklerI can confirm that opendev.org isn't reachable via IPv6 from my site either. I guess mnaser or some other vexxhost support  will have to take a look08:44
fricklercan't seem to log into gitea-lb01.opendev.org via v4 either, maybe there's more broken. but for v6 I'm immediately getting destination unreachable from my local ISP08:46
*** kopecmartin has joined #opendev08:54
kopecmartinhi, i created a new repo under x org by https://review.opendev.org/#/c/753773/ and i'd need to edit the group access now (maybe i should have done it within the review directly) .. i'd like to be added to ansible-role-refstack-client-core group and i'd like to include refstack-core group within the ansible-role-refstack-client-core one as well08:58
kopecmartincan anyone help please?08:58
*** piequi has joined #opendev09:09
*** piequi has left #opendev09:10
*** piequi has quit IRC09:10
fricklerkopecmartin: sure, added you to ansible-role-refstack-client-core and -release, you should be able to make all other changes yourself. not sure about the ansible-role-refstack-client-ci group, though, likely we would make -core the owner of that one?09:11
*** _marc-antoine_ has joined #opendev09:12
kopecmartinfrickler: thank you .. sure, let's make -core the owner of -ci group09:12
*** mkalcok has joined #opendev09:12
*** chrome0 has joined #opendev09:13
fricklerkopecmartin: done09:15
kopecmartinfrickler: thank you09:15
rpittaugood morning everyone, any issue going on right now on opendev ?09:16
rpittauI'm seeing this when trying to access: Failed to connect to opendev.org port 443: Connection timed out09:17
*** hashar has quit IRC09:17
rpittaugot some reports from other people as well09:17
fricklerrpittau: yes, seems there are issues with IPv6 connectivity, forcing v4 might be a workaround for now. we likely need help from vexxhost to resolve09:18
rpittauok, thanks frickler09:18
chrome0I seem to have issues with ipv4 as well? https://paste.ubuntu.com/p/yhmWzmKDPj/09:20
*** kleini has joined #opendev09:20
fricklerhmm, yes, maybe the issue isn't v6 after all. via a different provider I could log into gitea-lb01.opendev.org (via v6) and everything looks fine there afaict09:30
openstackgerritMerged openstack/diskimage-builder master: Ensure yum-utils is installed in epel element  https://review.opendev.org/75601009:56
donnydianw probably certs10:01
donnydianw: checking now10:02
chrome0fwiw opendev.org ipv4 recovered here "Connection to 38.108.68.124 80 port [tcp/http] succeeded!"10:09
*** priteau has quit IRC10:30
fricklerinfra-root: lots of error node launch attempts on ovh starting at 0600 according to grafana, nl04 logs look inconclusive to me10:36
donnydykarel:  try again with https://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_c18/757605/1/check/tripleo-ci-centos-8-scenario003-standalone/c1864e3/10:38
donnydI think the certs were up for renewal today10:39
ykareldonnyd, same result10:39
ykarelDeceptive site ahead10:39
donnydthat's interesting10:40
donnydI just renewed the cert, so it's surely not that10:40
donnydit's possible that your browser is complaining because the service is on 8080 and it's tls10:42
ykareldon't know what it is, happening for me on both chrome/firefox10:44
ykareldonnyd, for you ^ url working fine?10:45
donnydyea it works fine here10:45
ykareldonnyd, and before renewing certs you got same error?10:45
donnydi got no errors before either10:46
donnydbut that doesn't mean anything10:46
donnydwhat does your terminal say10:46
donnydcurl https://api.us-east.open-edge.io:808010:46
ykarel<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>10:46
donnydtry this link - https://api.us-east.open-edge.io:8443/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_c18/757605/1/check/tripleo-ci-centos-8-scenario003-standalone/c1864e3/10:49
ykarel^ returns same Deceptive site ahead10:49
donnydhttps://transparencyreport.google.com/safe-browsing/search?url=https:%2F%2Fapi.us-east.open-edge.io10:52
donnydit would appear that google doesn't like me10:52
ykareldonnyd, with ^ too same10:53
ykareldonnyd, i just checked from https://www.proxysite.com/ and from there it's working by selecting EU or US server10:53
ykarelso something to do with source request, i am from India10:54
donnydyea I am thinking there is probably some hoops they want me to jump through10:54
donnydok, I think I have it figured out11:01
donnydWe are working on it11:01
donnydykarel: no, it's something else. We should have this fixed up soon. There is no reason for concern11:03
donnydI appreciate you bringing it up though11:04
ykareldonnyd, Thanks, let us know once fixed11:06
openstackgerritSagi Shnaidman proposed zuul/zuul-jobs master: Install openswitch and firewall if need a bridge only  https://review.opendev.org/75783111:26
*** dviroel has joined #opendev11:26
*** lpetrut has joined #opendev11:37
_marc-antoine_opendev.org is working again, well done guys !11:38
*** marios has joined #opendev11:38
*** eolivare has quit IRC11:41
*** eolivare has joined #opendev11:42
lourotreview.opendev.org is down now though11:53
iceyand now review.opendev.org is down?11:53
icey(dang lourot beat me to it)11:53
mariosyah same lourot icey11:53
mariosicey: lourot: but i can work from git cli and git review stuff , just gerrit web is down looks like11:54
sshnaidmdown11:54
sshnaidmProxy Error11:54
iurygregorysame for me11:54
*** ysandeep is now known as ysandeep|brb11:56
ykarelsame for me too12:00
redrobotπŸ”₯πŸ”₯🐢πŸ”₯πŸ”₯12:03
*** ysandeep|brb is now known as ysandeep12:08
fungidonnyd: it's not just the swift service, it looks like firefox has decided the entire open-edge.io domain is suspect... https://open-edge.io/ gets me the same warning12:09
fungilourot: icey: marios: sshnaidm: iurygregory: ykarel: i'm looking into it now, guessing the gerrit service has stopped abruptly12:10
fungimm, no, it's running...12:11
mariosthank you fungi12:11
sshnaidmfungi, same about https://open-edge.io/ , it's in the blacklist of Google Safe Browsing. I reported there that it's a good site, we just need more people to do it I believe12:11
bolgi am also getting: Proxy Error for review.opendev.org12:12
fungisshnaidm: yep, i did too before i went to bed12:12
sshnaidmfungi, maybe worth sending to the discuss list so more people can report it as a good site12:13
fungibolg: yes, it seems the apache service on the server isn't getting a timely response from the java service to which it's proxying connections, i'm trying to establish why12:13
bolgfungi: thanks! (y)12:13
fungilots of errors in its log like:12:14
fungi[2020-10-13 12:13:47,006] [HTTP-66-selector-ServerConnectorManager@54759210/0] WARN  org.eclipse.jetty.util.thread.QueuedThreadPool : HTTP{STARTED,20<=20<=100,i=18,q=200} rejected org.eclipse.jetty.io.AbstractConnection$2@55a5c5ba12:14
fungi[2020-10-13 12:13:47,006] [HTTP-66-selector-ServerConnectorManager@54759210/0] WARN  org.eclipse.jetty.io.SelectorManager : Could not process key for channel java.nio.channels.SocketChannel[connected local=/127.0.0.1:8081 remote=/127.0.0.1:47906]12:14
fungilooks like maybe thread contention for socket handling?12:15
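The jetty dump above, `HTTP{STARTED,20<=20<=100,i=18,q=200}`, plausibly reads as min<=busy<=max threads with 200 jobs queued (my interpretation of jetty's QueuedThreadPool output, not authoritative). A minimal sketch of that failure mode, with workers wedged and the bounded queue full so new work is rejected:

```python
# Miniature of a bounded work queue whose workers never drain it: once
# it fills, each new job is rejected, analogous to jetty's
# "WARN ... QueuedThreadPool : ... rejected ..." lines in the log.
import queue

jobs = queue.Queue(maxsize=200)      # jetty's q=200, in miniature
for n in range(200):
    jobs.put_nowait(n)               # fills because no worker consumes

rejected = False
try:
    jobs.put_nowait(200)
except queue.Full:
    rejected = True
    print("rejected")
```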
fungidisk utilization on root seems to have started climbing rapidly at ~11:10 utc, but that's likely just from logs filling up with error messages12:16
fungino other obvious signs anything is out of the ordinary resource wise (spike in established tcp connections, but that's a likely symptom of it not responding for a bit)12:17
fungithere is a java process which seems fairly busy given nobody can reach the server12:17
fungii'm going to try restarting the container12:18
fungiand done, though it will likely take a couple minutes to start up fully12:19
Tenguthanks fungi12:20
fricklerthere's this just before the socket errors start: [2020-10-13 11:47:17,193] [HTTP-2666233] WARN  org.eclipse.jetty.util.thread.QueuedThreadPool : Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$3@140898fb in HTTP{START12:20
fricklerED,20<=20<=100,i=16,q=0}12:20
Tengu(and now everyone is refreshing, making the whole service re-crash ;)) - it seems to be back!12:21
fungi#status log restarted gerrit container on review.opendev.org after it stopped responding to apache12:21
openstackstatusfungi: finished logging12:21
fungi11:47:17 is definitely also just prior to anyone asking questions about it in here, so certainly seems suspicious12:22
fungistart looking into the routing or outage issues for our gitea service next12:23
*** sboyron has joined #opendev12:24
*** _marc-antoine_ has quit IRC12:25
mariosthanks fungi looks like it's back?12:31
lourotalso back for me, thanks!12:33
fungimarios: gerrit? yes, i restarted the service, looks like a thread got itself wedged somehow12:33
fungifor the gitea connectivity issues, i'll have to dig deeper, but those are almost certainly unrelated12:34
mariosthanks12:39
ykarelThanks fungi12:44
fungialso juggling virtual booth duty at ansiblefest right now, so had to switch focus to make sure i was all logged in and stuff, but looking into gitea now12:53
donnydfungi: thanks for the heads up, we are working it12:59
*** openstackgerrit has quit IRC13:17
fungi#status log restarted gerritbot on eavesdrop.o.o and germqtt on firehose.o.o following gerrit outage13:19
openstackstatusfungi: finished logging13:19
fungii'm able to directly clone repositories from all 8 gitea backends, so it doesn't seem any of them is down hard13:29
*** moguimar has joined #opendev13:39
bolgThanks fungi13:42
fungicacti graphs for the gitea backends don't indicate any obvious problems when folks were reporting connection timeouts, so i'm leaning increasingly toward assuming it was a temporary internet connectivity issue somewhere13:44
*** fressi has quit IRC13:57
clarkbfungi: catching up quickly before dialing into the board meeting and figuring out ansiblefest, it sounds like gerrit's java process was sad and restarted and separately there was a gitea issue? sounds like ipv6 connectivity issues to gitea13:59
clarkbanything else I should try and get up to speed with?13:59
*** mlavalle has joined #opendev13:59
clarkbre openedge phishing, we do host pre built static websites there I wonder if google's indexer has found that and has decided its bad13:59
clarkbfor example zuul-ci.org contents will be available there in test builds but the domain will be openedge14:00
funginothing else i can think of which is on fire, no14:00
fungialso i think i've gotten the last of the mirror volumes caught back up, but still need to double-check their timestamps14:00
*** roman_g has joined #opendev14:02
clarkbmy replication from review-test to test gitea is still running with ~1100 repos to go14:03
fungilooks like mirror.ubuntu-ports needs help, but the rest are fine now. i'll see what its problem is in a moment14:05
clarkbthanks!14:05
clarkbfungi: do you think we need to do more followup with gerrit, gitea, or openedge?14:05
fungiclarkb: not at the moment probably14:05
fungiunless someone wants to dig into the gerrit error frickler found14:06
clarkb"unexpected thread death" ?14:07
clarkbmy initial hunch is we're going to discover its a jetty bug and yet another reason to upgrade :)14:07
fungithat's my gut feel, yeah14:11
*** redrobot has quit IRC14:22
*** ysandeep is now known as ysandeep|away14:25
clarkbis it possible the db outage didn't happen until just now? that may also explain it14:27
clarkbthe friday window was super quiet at least during the period of it I managed to stay awake14:28
fungiyeah, if the outage happened today then they didn't tell us14:28
clarkbI wonder if we can check the db uptime somehow? mysql may expose that?14:28
clarkbjust to rule that out14:29
*** nuclearg1 has joined #opendev14:35
fungimirror.ubuntu-ports had a stale vldb lock, so i've taken the flock for its update cronjob on mirror-update.openstack.org after the last run, unlocked the volume manually and started a vos release -localauth in a root screen session on afs01.dfw14:41
*** lpetrut has quit IRC14:51
*** fressi has joined #opendev14:55
*** fressi has quit IRC15:04
*** smcginnis has quit IRC15:17
*** ykarel is now known as ykarel|away15:18
*** smcginnis has joined #opendev15:30
*** mkalcok has quit IRC15:45
*** ykarel|away has quit IRC15:50
fungilooking at http://grafana.openstack.org/d/ACtl1JSmz/afs?orgId=1 mirror.ubuntu seems to be slightly over quota and mirror.ubuntu-ports is very close15:59
fungias is mirror.yum-puppetlabs15:59
clarkbyou should be able to safely bump up ubuntu quota16:00
clarkbwe've cleaned up a lot of the old suse and fedora stuff recently so should have plenty of room16:00
fungimirror.ubuntu may not be over quota yet, but it's at least nearly there16:01
clarkbwe may also be able to trim the ubuntu-ports mirror if people aren't testing on older images there?16:04
clarkbthat seems very likely given how we've relied on newer kernels than distros have provided by default in the past16:04
*** rpittau is now known as rpittau|afk16:21
*** SotK has quit IRC16:27
*** SotK has joined #opendev16:29
*** rpittau|afk has quit IRC16:31
*** marios is now known as marios|out16:32
*** ShadowJonathan has quit IRC16:33
*** rpittau|afk has joined #opendev16:34
*** ShadowJonathan has joined #opendev16:35
*** priteau has joined #opendev16:39
*** marios|out has quit IRC16:43
*** hamalq has joined #opendev16:44
clarkbdown to 1100 replication tasks remaining16:53
*** ykarel|away has joined #opendev17:01
mwhahahahey is there any plan to upgrade gerrit at least to a newer 2.x version?17:10
clarkbmwhahaha: http://lists.opendev.org/pipermail/service-discuss/2020-October/000103.html17:11
clarkbwe haven't scheduled an outage window yet, but have what appears to be a working upgrade process all the way to 3.217:11
mwhahahanice17:12
fungihope to talk about the schedule in the opendev meeting later today17:12
mwhahahai ask cause Emilien was hitting problems with F33's ssh crypto policies and the current gerrit version17:12
clarkbstill doing testing. Working on replication to gitea as we speak. Need to test project creation and renaming after that. We also know that jeepyb launchpad integration will break because the database is going away17:12
fungimwhahaha: yep, i gave him a hopefully less intrusive workaround in #openstack-infra a few minutes ago17:12
clarkbwith the release this week, summit next, ptg after, then elections after that I doubt it happens before middle of novemeber17:12
fungielections?17:13
mwhahahaah nice17:13
clarkbmwhahaha: also https is an option if people would prefer not to modify ssh configs (though they can be modified on a per host basis so doesn't seem like a major deal)17:13
mwhahahayou can target just review. int he ssh config anyway17:13
clarkbmwhahaha: I've also been encouraging people that are interested to check out https://review-test.opendev.org which is an upgraded snapshot of production from october 117:14
mwhahahai was asking because i was thinking about the ed keys which were added in 2.14 i think17:14
*** ykarel|away has quit IRC17:14
clarkbmwhahaha: in this case its the host key that it is complaining about not the user auth key17:14
mwhahahayes i'm aware17:14
clarkband I think ed keys don't work in gerrit at all? might be ecdsa17:14
clarkb(it generates them for you when you upgrade then complains about them when you start it)17:15
mwhahahaSupport for elliptic curve/ed25519 SSH keys17:15
mwhahahais that just server side?17:15
mwhahahahttps://bugs.chromium.org/p/gerrit/issues/detail?id=450717:15
clarkbmwhahaha: I think they added support for them for user auth but not host keys17:15
mwhahaharight17:15
mwhahahai was talking user auth17:15
clarkbjust calling that out because switching the user auth key type doesn't necessarily fix emilienM's problem. I tested review-test and its host keys should be fine but not sure what version that changed in17:16
mwhahahayea i know it's two different things17:16
mwhahahathanks for clarifying tho17:16
clarkbI think my favorite new feature in gerrit is the ability to set a user status. review-test thinks I'm partying17:18
mwhahahaha17:18
fungijust need a tiki icon to go with it17:18
clarkbthere are actually a lot of properly useful features that I'll be happy to see like single page diffs for all files and better dashboarding17:19
clarkbbut we know it isn't perfect, we'll likely try and fix integrations and other things people notice after we upgrade just so that we actually get the upgrade done17:19
clarkbone ui nit I've noticed recently is it puts the change approval, rebase, edit, abandon buttons below the user area and if you click on the user area it does a fly out thing that will cover those other commands17:20
clarkbhave to carefully click to avoid abandoning17:21
*** Guest75569 has joined #opendev17:21
* fungi abandons his changes and goes partying17:23
*** Guest75569 is now known as redrobot17:23
clarkbfungi: re election I expect people will be very distracted on top of normal ptg hangover state17:25
clarkbI mean maybe that is a good time for us to upgrade gerrit if we can get over our own hangovers :)17:26
fungiahh, not openstack elections17:26
fungiwell, you're probably right that even people outside the usa will be distracted by whatever results from the usa elections17:27
*** nuclearg1 has quit IRC17:28
*** andrewbonney has quit IRC17:28
*** eolivare has quit IRC17:30
clarkbthe major struggle in testing this is we have what I'm beginning to think of as data rot. We've accumulated so much data in gerrit over 8ish years that testing is slow. I did a lot of testing locally to check out simple things but eventually its just easier with better coverage confidence to test with the prod snapshot17:32
clarkbI've noticed that nova has grown beyond 1GB recently too17:33
clarkbits just over 1GB now17:33
fungithat's after aggressive gc right?17:35
clarkbyup17:35
clarkbI used nova as a git clone test from gerrit17:35
clarkband it came out to like 1.01GB17:35
clarkbs/gerrit/review-test/17:35
fungitime to declare nova feature complete17:35
clarkbthats without the extra refs too17:36
clarkbthats what you get if you clone it normally to dev on it17:36
*** roman_g has quit IRC17:52
*** priteau has quit IRC17:53
*** priteau has joined #opendev17:53
*** ralonsoh has quit IRC18:00
*** priteau has quit IRC18:01
*** sshnaidm is now known as sshnaidm|afk18:24
*** sboyron has quit IRC18:44
fungiubuntu-ports volume was released successfully and then i ran another mirror update manually just to make sure it's working correctly18:49
fungii'll take a look at quotas here momentarily18:49
fungiyeah, fs listquota says mirror.ubuntu is 99% and mirror.ubuntu-ports is 93% used18:52
*** hashar has joined #opendev18:57
*** fressi has joined #opendev18:58
*** fressi has quit IRC18:59
fungi#status log increased mirror.ubuntu afs quota from 550000000 to 650000000 (99%->84% used)19:00
openstackstatusfungi: finished logging19:00
fungi#status log increased mirror.ubuntu-ports afs quota from 500000000 to 550000000 (93%->84% used)19:01
openstackstatusfungi: finished logging19:01
*** diablo_rojo has joined #opendev19:12
*** openstackgerrit has joined #opendev19:17
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766019:17
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766019:30
ianwcorvus: not mentioned but there's a little stack @ https://review.opendev.org/#/c/756605/ about capturing container logs which you might like to consider20:01
ianwnot mentioned in meeting sorry20:01
corvusianw: ack, thanks for the heads up will look in a sec20:01
clarkbinfra-root I mentioned it in the meeting too but the stack at https://review.opendev.org/#/c/757162/ is good flavor for the things that will change config wise on the gerrit server as we go through the upgrade20:03
clarkbhttps://etherpad.opendev.org/p/gerrit-2.16-upgrade is a rough rough draft of the mechanical process to implement the upgrade20:04
clarkbI intend on rewriting that to reduce the questions and stick to a concrete plan soon. I think we've largely sorted out that process at this point and now its just verifying the results20:04
clarkbone really neat thing is that if you gc --aggressive before reindexing the reindexing cost drops like a stone20:05
clarkbthat tip from luca is likely to be the major difference between a 2.16 only upgrade and the 3.2 upgrade20:05
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766020:06
clarkbthe notedb migration does a built in reindex and it doesn't gc first which is part of why that step is so slow20:06
clarkbbut that is one of three reindexes so the other two are much quicker20:06
*** hashar has quit IRC20:30
*** Dmitrii-Sh has quit IRC20:40
*** Dmitrii-Sh has joined #opendev20:41
*** iurygregory has quit IRC20:50
*** iurygregory has joined #opendev20:50
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766020:52
clarkbok ansible fest things are done for the day I'm going to get a bike ride in before it starts raining again20:55
clarkbreplication is still going20:55
corvusianw: +1 on that logging change but i noticed an anomaly; left a comment21:02
ianwcorvus: yeah, so in post we try to "docker logs > " all containers to capture their logs, which goes into the docker directory, but with the change updated containers are logging to /var/log/containers21:12
ianwwe can remove the "docker logs" dump when all containers are logging via syslog21:13
ianwcorvus/clarkb: the other one under that if you have a sec is https://review.opendev.org/#/c/756628/3 to remove the custom rsyslogd bit we install, which i don't think we need any more21:14
corvusianw: but docker logs is the one that's working?21:27
corvusand whatever writes to 'containers' (i assume that's podman?) is failing?21:27
ianwcorvus: yes, in post we do a speculative dump of docker and podman container logs (both, i think) with ignore on, so for jobs with no containers involved its also failing21:29
corvusianw: we'll want to keep the 'docker logs' dump though, since that's still working.  unless you plan to collect /var/logs/containers?21:31
corvusianw: i got that backwards didn't i?21:32
corvusianw: it's 'docker' that's failing and containers that's working21:32
ianwcorvus: yep; when containers are directing their logs to syslog (which we collect and save in /var/log/containers), "docker logs" on them shows up that "the logs aren't here" message21:33
corvusianw: because you did add an entry to collect /var/log/containers21:33
ianwright, basically i'd like to convert everything so container logs are in "/var/log/containers/docker-<sensible-name>"21:34
corvusianw: okay, sorry i messed that up.  i agree that change looks good and removing the docker collection is fine.  one more question though: that means that "docker logs foo" isn't going to work for us on the real hosts either.  are we okay with that?  (i'm assuming we are -- we'll just 'less /var/log/containers' and probably enjoy that more anyway).21:34
ianwright, personally i find "docker logs" quite frustrating compared to just a regular file on disk21:35
corvuswfm.21:35
corvusianw: +2; will let you +w or circulate more as appropriate21:35
ianwi'll update the other compose files as well21:36
fungisome of the same complaints i have with journald21:36
ianwin a follow-on21:36
fungigood ol' log files in /var are hard to beat21:36
ianwthe only problem with syslog is the ridiculous low-precision, english-based timestamp format21:37
fungiyes, i concur21:39
fungiiso-8601 with subseconds would be far better21:40
fungiinterestingly, rfc 5424 (and 3339 before it) mandates iso-8601 for syslog protocol21:47
fungi5424 also allows for microsecond resolution in the timestamp21:48
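A quick sketch of the two timestamp shapes being compared: classic syslog's low-precision, English-month format versus the ISO 8601 / RFC 3339 form with sub-seconds that RFC 5424 permits:

```python
# Contrast a classic syslog-style timestamp (no year, no sub-seconds,
# locale month name) with the RFC 3339 / ISO 8601 form RFC 5424 allows.
from datetime import datetime, timezone

now = datetime(2020, 10, 13, 21, 48, 0, 123456, tzinfo=timezone.utc)
print(now.isoformat())                  # 2020-10-13T21:48:00.123456+00:00
print(now.strftime("%b %d %H:%M:%S"))   # Oct 13 21:48:00
```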
openstackgerritMerged opendev/system-config master: Remove Ubuntu Xenial ARM64 base testing  https://review.opendev.org/75662721:48
fungiyou can instruct rsyslog to write out timestamps in any format you like21:49
ianwyes, you may notice the esxi syslog puts out logs in this format21:55
ianwi wonder if the latest releases are still using vmsyslogd21:57
ianwlooks like it.  i originally wrote all that in python to replace the busybox syslog collection it had before, with the plan to write it "properly" in C++ or whatever once all the esxcli integration was fleshed out etc.22:00
ianwfungi: if you have any thoughts on how else to get back to the default file in https://review.opendev.org/#/c/756628/ i'm open to suggestions too :)22:03
fungi`apt install --reinstall rsyslog` might do it, `apt purge rsyslog && apt install rsyslog` almost certainly would22:06
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766022:08
ianwyeah that takes the purge/install approach22:12
ianwright, i'm going to manually pip upgrade jinja on bridge now and try the borg playbook22:12
*** paramite has quit IRC22:23
ianwwell ethercalc02 has applied borg ok, i'm running the backup script manually now and it seems to be backing up22:31
*** diablo_rojo has quit IRC22:34
*** qchris has quit IRC22:41
*** qchris has joined #opendev22:54
fungiyay!22:59
clarkbwoot23:00
* clarkb is now back from the bike ride23:00
openstackgerritIan Wienand proposed opendev/system-config master: borg-backups: add some extra excludes  https://review.opendev.org/75796523:02
clarkbit occurred to me on my bike ride I hadn't tested stream events yet. I have now done that, seems to work fine23:02
fungiexcellent23:03
*** mlavalle has quit IRC23:03
clarkbianw: corvus: the problem I've found with docker logs foo is that it prints from the beginning of time which can lead to very long wait times23:03
clarkbeven if you provide a --since flag it has to scroll through without printing first23:04
ianwyeah, same problem journalctl seems to have as well23:04
clarkbianw: for https://review.opendev.org/#/c/756628/3/playbooks/roles/base/server/tasks/Debian.yaml purging won't affect running services right? it should keep running the existing service process then restart it when the reinstall happens?23:05
* clarkb is trying to be slushy :)23:05
clarkbfungi: ^ as someone more in tune with the openstack release process maybe you want to take a look at that one and decide if it is safe to go in now?23:05
ianwclarkb: yeah, i think there will be a small period of cutover only.  i don't mind waiting a bit on that23:05
clarkbI've +2'd it and if fungi is comfortable with it he can approve now or we can approve tomorrow post release23:06
ianwwe can revert after one run of base23:06
clarkbI'm going to take a second look at the replication plugin this time by rtfs'ing to see if we can perhaps make replication a bit more friendly post notedb migration23:07
clarkbI get the sense that most people use replication to cluster or have a hot standby server and not to take load off the main server23:07
ianwclarkb: if you'd like to play, backup02.ca-ymq-1.vexxhost.opendev.org with borg-ethercalc02 user should be able to be poked at23:08
clarkb(and in those cases you want to replicate everything)23:08
*** slaweq has quit IRC23:08
ianw /opt/borg/bin/borg is the binary23:08
clarkbianw: we're doing that over ssh but without encryption at rest right? so no passphrase to sort out?23:09
ianwclarkb: right, no encryption on disk23:10
clarkbianw: I see 3 backups for a total of 1.82 GB23:13
clarkbnow to try a fuse mount and see that the redis dump is there23:14
ianwI got "Warning: The repository at location /opt/backups-202010/borg-ethercalc02/backup was previously located at /opt/backups/borg-ethercalc02/backup" when i looked23:15
clarkb/opt/borg/bin/borg info '::ethercalc02-2020-10-13T22:49:26' is my command no warning23:15
clarkband borg list shows you what the valid ^ things are23:16
clarkbbefore I fuse mount do we have an excluded backup mount point?23:16
clarkbthat may be good so we avoid a feedback loop23:16
clarkb`/opt/borg/bin/borg mount '::ethercalc02-2020-10-13T22:49:26' /mnt` is how I was going to fuse mount but holding off while I check our exclusions23:16
ianwi'm pretty sure it won't cross file-systems23:17
clarkblooks like we don't have a good exclude for that; should we call it /root/borg_mnt ?23:17
clarkboh23:17
ianwthe other thing is that it's an include-process, so as long as you don't mount under one of the included dirs too23:17
ianwclarkb: https://review.opendev.org/#/c/757965/1/playbooks/roles/borg-backup/defaults/main.yaml (if you want to review that too :)23:18
clarkboh right we're backing up /etc /home /var and /root23:18
clarkbso /mnt is safe /me mounts and looks then23:19
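The reasoning above — backups are include-driven, so a fuse mountpoint is safe as long as it sits outside every included directory — can be sketched in a few lines of Python (a hypothetical illustration of the selection logic being discussed, not borg's actual implementation; the include list is from the conversation, the exclude is made up):

```python
# Hypothetical sketch of include-then-exclude path selection: a path is
# backed up only if it falls under one of the included roots and under
# none of the excludes.
INCLUDES = ["/etc", "/home", "/var", "/root"]
EXCLUDES = ["/var/cache"]  # hypothetical exclude, for illustration


def is_backed_up(path):
    def under(p, root):
        return p == root or p.startswith(root + "/")

    if not any(under(path, inc) for inc in INCLUDES):
        return False
    return not any(under(path, exc) for exc in EXCLUDES)


print(is_backed_up("/etc/hosts"))      # True: under an included root
print(is_backed_up("/mnt/snapshot"))   # False: /mnt is not included, so
                                       # fuse-mounting there is safe
print(is_backed_up("/var/cache/apt"))  # False: included, but excluded
```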
ianwi think it cached my response to the "repository moved" question23:19
clarkbborg mount not available: loading FUSE support failed [ImportError: No module named 'llfuse']23:19
ianwweird that it doesn't know about symlinks23:19
ianwhrm, we may not have built the pip install with fuse support?23:19
clarkbya it needs the llfuse python package23:20
clarkbI personally think that is worthwhile as the fuse support is one of my favorite borg functionalities23:20
clarkbreally simplifies verification and all that of backups23:20
ianwdo we need it on the hosts or just the server?23:20
clarkbI believe the hosts23:20
clarkbsince its running the fuse on the client side23:21
ianwoh right, to be able to mount on each server23:21
clarkbpip install borgbackup[fuse] is what googling tells me23:21
clarkbfrom what I can see so far it seems happy though23:23
clarkbianw: what command produces that warning?23:23
clarkband where are you running it from?23:23
ianwclarkb: i was running that on the backup server, as the borg user23:23
*** zigo has quit IRC23:23
ianwbut i think i'm seeing that i should reframe things to think about inspecting it on the remote server23:24
clarkbianw: fwiw I get that too because I do offsite backups to my brother's house over a shared ISP connection (so it's really fast) but his dynamic addressing changes and it warns me23:24
clarkbit seems to be harmless and is more of a "hey user you may not have expected this so we're warning you" rather than "functionality is degraded"23:24
ianwok, as long as it's not corrupting anything :)23:24
clarkbhrm ya I've always interacted with it in the context of the client side23:24
*** qchris has quit IRC23:25
clarkbbecause my remote is a slow raspi23:25
clarkbwith encryption you really really really want to do that on the faster client side :)23:25
ianwok, i can look at the fuse bits.  i think we leave it for a few days to test the nightly runs and then we can look at rolling it out23:27
clarkb++23:27
clarkbif the package deps for fuse are bad somehow we can also just document that you need to install it in a different venv or something23:27
clarkbbut ya beats grabbing a complete tarball and then extracting specific bits23:28
openstackgerritIan Wienand proposed opendev/system-config master: [wip] reprepro  https://review.opendev.org/75766023:30
ianwlooks like "libfuse-dev fuse pkg-config" which i guess isn't too bad23:31
*** qchris has joined #opendev23:34
*** tosky has quit IRC23:40
fungisorry, sucked into election activity but can maybe look later23:44

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!