Monday, 2019-03-11

*** armax has quit IRC00:08
*** lathiat has quit IRC00:09
*** lathiat has joined #openstack-infra00:15
*** ricolin has joined #openstack-infra00:59
*** Haunted330 has joined #openstack-infra01:04
*** wolverineav has quit IRC01:11
*** Haunted330 has quit IRC01:14
*** Haunted330 has joined #openstack-infra01:14
*** Haunted330 has quit IRC01:30
*** wolverineav has joined #openstack-infra01:35
*** jamesmcarthur has joined #openstack-infra01:41
openstackgerritGhanshyam Mann proposed openstack-infra/irc-meetings master: Modify the QA office hour time  https://review.openstack.org/64230801:42
*** jamesmcarthur has quit IRC01:48
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider  https://review.openstack.org/59033501:52
*** hongbin has joined #openstack-infra02:02
*** wolverineav has quit IRC02:27
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add tenant and project scoped, JWT-protected actions  https://review.openstack.org/57690702:32
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dmn]  https://review.openstack.org/64231402:54
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dmn] stashing some scripts to make git:// -> https:// changes  https://review.openstack.org/64231402:54
*** whoami-rajat has joined #openstack-infra03:07
*** ykarel has joined #openstack-infra03:22
*** wolverineav has joined #openstack-infra03:36
*** jamesmcarthur has joined #openstack-infra03:39
*** wolverineav has quit IRC03:40
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dmn] stashing some scripts to make git:// -> https:// changes  https://review.openstack.org/64231403:43
*** wolverineav has joined #openstack-infra03:51
*** hongbin has quit IRC03:58
*** udesale has joined #openstack-infra04:03
*** wolverineav has quit IRC04:07
openstackgerritRiju Khatri proposed openstack-infra/storyboard-webclient master: Add nofollow attribute to hyperlinks  https://review.openstack.org/64232704:19
*** wolverineav has joined #openstack-infra04:24
*** jamesmcarthur has quit IRC04:32
*** yamamoto has joined #openstack-infra04:32
*** stakeda has joined #openstack-infra04:37
*** ramishra has joined #openstack-infra04:45
*** janki has joined #openstack-infra04:46
openstackgerritTrinh Nguyen proposed openstack-infra/system-config master: Add meetbot to openstack-fenix channel  https://review.openstack.org/64234005:18
*** jaosorior has joined #openstack-infra05:52
openstackgerritIan Wienand proposed openstack-infra/zuul-sphinx master: Add type to role variables  https://review.openstack.org/64116805:53
*** apetrich has joined #openstack-infra06:11
*** ianychoi has quit IRC06:32
*** ianychoi has joined #openstack-infra06:32
*** ianychoi has quit IRC06:35
*** ianychoi has joined #openstack-infra06:36
*** jtomasek has joined #openstack-infra06:50
*** jbadiapa has joined #openstack-infra06:51
*** e0ne has joined #openstack-infra06:51
*** pcaruana has joined #openstack-infra07:00
*** e0ne has quit IRC07:02
*** rcernin has quit IRC07:03
openstackgerritRiju Khatri proposed openstack-infra/storyboard-webclient master: Show all stories created  https://review.openstack.org/64237007:04
*** kopecmartin has joined #openstack-infra07:06
*** slaweq has joined #openstack-infra07:34
openstackgerritAnkita Bansal proposed openstack-infra/storyboard-webclient master: allow subscriptions to projects when items in project groups list are expanded Story: 2000545 Task: 2911  https://review.openstack.org/64237107:35
*** mpjetta has quit IRC07:36
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Add python-path option to node  https://review.openstack.org/63733807:37
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider  https://review.openstack.org/59033507:37
*** mpjetta has joined #openstack-infra07:40
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement a Runc driver  https://review.openstack.org/53555607:41
*** rascasoft has joined #openstack-infra07:44
*** pgaxatte has joined #openstack-infra08:01
*** ginopc has joined #openstack-infra08:02
*** rpittau|afk is now known as rpittau08:09
*** xek has joined #openstack-infra08:14
*** tkajinam has quit IRC08:17
*** gouthamr has quit IRC08:18
*** stevebaker has quit IRC08:19
*** dmellado has quit IRC08:20
*** helenafm has joined #openstack-infra08:22
*** e0ne has joined #openstack-infra08:27
*** hwoarang has quit IRC08:34
*** hwoarang has joined #openstack-infra08:36
*** dtantsur|afk is now known as dtantsur08:38
*** tosky has joined #openstack-infra08:41
*** roman_g has joined #openstack-infra08:47
*** iurygregory has joined #openstack-infra08:50
*** zbr has joined #openstack-infra08:52
openstackgerritMerged openstack-infra/irc-meetings master: Modify the QA office hour time  https://review.openstack.org/64230808:55
*** jpena|off is now known as jpena08:56
*** wolverineav has quit IRC08:56
*** jpich has joined #openstack-infra08:57
*** jpich has quit IRC08:57
*** adrianreza has quit IRC09:00
*** adrianreza has joined #openstack-infra09:00
*** needssleep has quit IRC09:01
*** jpich has joined #openstack-infra09:02
*** ykarel has quit IRC09:10
*** hwoarang has quit IRC09:11
*** ykarel has joined #openstack-infra09:11
*** hwoarang has joined #openstack-infra09:12
openstackgerritMerged openstack/diskimage-builder master: [lvm] Add Ubuntu bionic as supported distro  https://review.openstack.org/64085009:19
*** e0ne has quit IRC09:22
*** owalsh_ is now known as owalsh09:22
*** noama has joined #openstack-infra09:28
*** janki has quit IRC09:30
*** jchhatbar has joined #openstack-infra09:30
*** ykarel is now known as ykarel|lunch09:37
*** e0ne has joined #openstack-infra09:42
*** derekh has joined #openstack-infra09:43
*** panda is now known as panda|rover09:50
*** wolverineav has joined #openstack-infra09:57
*** wolverineav has quit IRC10:01
*** priteau has joined #openstack-infra10:03
*** ykarel|lunch is now known as ykarel10:07
dulekI'm trying to set up CoreDNS for K8s clusters deployed by DevStack in kuryr-kubernetes gates.10:13
dulekIs there any chance to learn the upstream DNS server address in DevStack?10:13
openstackgerritAdam Coldrick proposed openstack-infra/storyboard-webclient master: Show tags with stories in project view.  https://review.openstack.org/64223010:13
dulekObviously 127.0.0.1 from /etc/resolv.conf will not work for me. :D10:13
*** yamamoto has quit IRC10:17
*** stakeda has quit IRC10:26
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: web: add tenant and project scoped, JWT-protected actions  https://review.openstack.org/57690710:26
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Allow operator to generate auth tokens through the CLI  https://review.openstack.org/63619710:26
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Zuul CLI: allow access via REST  https://review.openstack.org/63631510:26
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration  https://review.openstack.org/63985510:26
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Web: plug the authorization engine  https://review.openstack.org/64088410:27
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: Zuul Web: add /api/user/actions endpoint  https://review.openstack.org/64109910:27
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: authentication config: add optional token_expiry  https://review.openstack.org/64240810:27
*** luizbag has joined #openstack-infra10:32
*** udesale has quit IRC10:34
*** gfidente has joined #openstack-infra10:37
*** electrofelix has joined #openstack-infra10:40
*** e0ne has quit IRC10:43
*** yamamoto has joined #openstack-infra10:46
*** e0ne has joined #openstack-infra10:47
*** jbadiapa has quit IRC10:52
*** yamamoto has quit IRC10:52
*** jbadiapa has joined #openstack-infra10:52
*** jbadiapa has quit IRC10:54
*** jbadiapa has joined #openstack-infra10:54
*** yamamoto has joined #openstack-infra10:56
*** ricolin has quit IRC10:59
*** gouthamr has joined #openstack-infra10:59
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: web: add tenant and project scoped, JWT-protected actions  https://review.openstack.org/57690711:03
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Allow operator to generate auth tokens through the CLI  https://review.openstack.org/63619711:03
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Zuul CLI: allow access via REST  https://review.openstack.org/63631511:04
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration  https://review.openstack.org/63985511:04
*** dmellado_ has joined #openstack-infra11:04
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Web: plug the authorization engine  https://review.openstack.org/64088411:04
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: Zuul Web: add /api/user/actions endpoint  https://review.openstack.org/64109911:04
*** dmellado_ is now known as dmellado11:05
*** dave-mccowan has joined #openstack-infra11:19
*** ykarel_ has joined #openstack-infra11:22
*** stevebaker has joined #openstack-infra11:22
*** ykarel has quit IRC11:24
*** hwoarang has quit IRC11:27
*** hwoarang has joined #openstack-infra11:35
*** e0ne has quit IRC11:37
openstackgerritJakub Bielecki proposed openstack-infra/zuul-preview master: add basic description into README.rst  https://review.openstack.org/64242811:41
*** dave-mccowan has quit IRC11:42
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: authentication config: add optional token_expiry  https://review.openstack.org/64240811:43
*** edmondsw has joined #openstack-infra11:47
*** ykarel_ is now known as ykarel11:49
*** rlandy has joined #openstack-infra11:57
*** jchhatbar has quit IRC11:58
*** rh-jelabarre has joined #openstack-infra12:02
*** aojea has joined #openstack-infra12:08
*** markvoelker has quit IRC12:14
*** panda|rover is now known as panda|rover|lunc12:21
*** e0ne has joined #openstack-infra12:30
*** ginopc has quit IRC12:35
*** janki has joined #openstack-infra12:35
*** trown|back11mar is now known as trown12:46
*** jamesmcarthur has joined #openstack-infra12:48
*** jpena is now known as jpena|lunch12:57
*** hwoarang has quit IRC13:00
*** kgiusti has joined #openstack-infra13:01
*** pgaxatte has quit IRC13:01
*** pgaxatte has joined #openstack-infra13:03
*** hwoarang has joined #openstack-infra13:04
dmsimardAnyone know if spam on freenode is still fairly problematic? The nickserv auth is still a problem for many IRC clients that attempt to join channels before authenticating.13:08
smcginnisdmsimard: I just saw something recently (beginning of last week?) that there was yet another wave of spam bots hitting freenode server.13:11
*** jcoufal has joined #openstack-infra13:12
smcginnisI think the auth requirement is here to stay, unfortunately. Just too easy to get hit if it's left open.13:12
dmsimardsmcginnis: yeah, I was suspecting as much but I was holding onto the hope that it wasn't so :(13:14
toskysmcginnis: just a note with my KDE hat on - we reopened various channels because we are promoting the usage of the matrix bridge13:14
toskyso far nothing happened (after a week)13:14
toskyI've seen spam on a few OFTC channels, though13:14
smcginnistosky: Oh? How's the matrix bridge working out?13:16
toskysmcginnis: there are some delays from time to time, but it's improving; they gave us a server: https://dot.kde.org/2019/02/20/kde-adding-matrix-its-im-framework13:17
smcginnisI did use the matrix mobile client for a while to try it out and for the most part I liked it. I did notice some delays (this was at least 6 months ago), but it worked.13:18
smcginnisCool to see a larger effort to use that.13:18
toskyI think someone here was looking into matrix - it would be interesting to have an opendev matrix server federated with the rest of the system13:18
toskythey are also rewriting riot (the client) and everyone seems very excited about this, but I didn't try the snapshots yet13:18
*** panda|rover|lunc is now known as panda|rover13:23
*** yamamoto has quit IRC13:24
*** ianychoi has quit IRC13:28
*** ianychoi has joined #openstack-infra13:29
*** jamesmcarthur has quit IRC13:30
*** jamesmcarthur has joined #openstack-infra13:31
*** wolverineav has joined #openstack-infra13:33
*** jamesmcarthur has quit IRC13:36
*** wolverineav has quit IRC13:37
*** mriedem has joined #openstack-infra13:39
*** yamamoto has joined #openstack-infra13:43
*** beekneemech is now known as bnemec13:43
*** jamesmcarthur has joined #openstack-infra13:45
*** jamesmcarthur_ has joined #openstack-infra13:49
*** jrist has quit IRC13:49
pabelangerdmsimard: yah, proxy also had issue over the weekend13:50
*** otherwiseguy has quit IRC13:51
*** sthussey has joined #openstack-infra13:52
*** jrist has joined #openstack-infra13:53
*** jamesmcarthur has quit IRC13:53
*** otherwiseguy has joined #openstack-infra13:53
smcginnisreview.o.o down?13:55
rpittauit seems so13:55
rpittau:/13:55
smcginnisBut it's not even a deadline week. :)13:56
openstackgerritMerged openstack-infra/elastic-recheck master: Add query for nova functional test bug 1819374  https://review.openstack.org/64229213:56
openstackbug 1819374 in OpenStack Compute (nova) "test_interface_detach_with_port_with_bandwidth_request intermittently fails" [Medium,In progress] https://launchpad.net/bugs/1819374 - Assigned to Balazs Gibizer (balazs-gibizer)13:56
slaweqworking again :)13:56
slaweqat least for me13:56
smcginnisOh, yep!13:56
*** efried1 has joined #openstack-infra13:56
*** agopi has joined #openstack-infra13:56
*** efried has quit IRC13:57
*** efried1 is now known as efried13:57
*** jpena|lunch is now known as jpena13:59
mordredsmcginnis, tosky: I was looking at matrix a bit and was pretty pleased with it, and also with the IRC bridge. I think it's a worthwhile thing to talk about again in denver14:03
smcginnismordred: I think performance/lag was the big issue brought up in the past, but hosting our own matrix servers was an option to maybe address that.14:04
smcginnisI think it would definitely be worth talking about more in Denver.14:04
mordredyeah. there also seems to be a wechat bridge, although I haven't yet investigated how good it is14:05
fungidmsimard: we did have one wander into this channel over the weekend even with the nick registration requirement in place14:06
dmsimardmordred: fwiw I wrote a rudimentary bridge for ara that (probably) supports a bunch of protocols14:07
dmsimardtwo errbot instances, one on protocol A, another on protocol B -- and they relay messages to each other14:07
dmsimardso any backend supported by errbot is supported, in theory14:07
mordrednod.14:07
fungitosky: the last time we tried removing registration as a requirement from our channels (a few months ago) we made it a week or two before the spammers found us and started ramping back up across various channels again, so finally had to put it back in place14:07
dmsimardfungi: yeah I think I saw that14:08
*** ginopc has joined #openstack-infra14:11
*** e0ne has quit IRC14:14
*** dave-mccowan has joined #openstack-infra14:19
*** armax has joined #openstack-infra14:21
*** e0ne has joined #openstack-infra14:23
dmelladodmsimard: o/14:30
dmelladowho's spamming you? xD14:30
dmsimard:(14:32
dmelladotosky: matrix bridge?14:33
*** e0ne has quit IRC14:36
*** e0ne has joined #openstack-infra14:38
*** priteau has quit IRC14:44
openstackgerritAnkita Bansal proposed openstack-infra/storyboard-webclient master: project_group view: add number of active stories beside repo list  https://review.openstack.org/64221114:46
*** FlorianFa has joined #openstack-infra14:50
*** FlorianFa has quit IRC14:52
*** FlorianFa has joined #openstack-infra14:52
*** udesale has joined #openstack-infra14:56
*** priteau has joined #openstack-infra14:59
*** dpawlik has quit IRC15:02
*** hwoarang has quit IRC15:05
*** hwoarang has joined #openstack-infra15:07
*** ginux has joined #openstack-infra15:15
*** ginopc has quit IRC15:15
*** ginux is now known as ginopc15:15
*** yamamoto has quit IRC15:21
iurygregoryMorning everyone, quick question: does anyone have an idea why zuul is saying that the job is not defined in https://review.openstack.org/#/c/642474 when ironicclient-functional is defined in https://review.openstack.org/#/c/642474/4/zuul.d/ironicclient-jobs.yaml15:26
*** roman_g has quit IRC15:26
*** ykarel is now known as ykarel|afk15:28
*** yamamoto has joined #openstack-infra15:29
AJaegeriurygregory: commented - the error message might be wrong, best double-check that everything is valid yaml15:29
iurygregoryAJaeger, do you know any tool that i could use to check? =)15:30
iurygregoryoh ty for the comment =)15:30
AJaegernot sure whether that is really the problem - worth a try...15:32
*** janki has quit IRC15:33
openstackgerritRiju Khatri proposed openstack-infra/storyboard-webclient master: Show all stories created  https://review.openstack.org/64237015:36
*** jamesmcarthur_ has quit IRC15:38
*** jamesmcarthur has joined #openstack-infra15:38
*** agopi is now known as agopi|brb15:39
*** yamamoto has quit IRC15:44
*** yamamoto has joined #openstack-infra15:45
openstackgerritHelena proposed openstack-infra/project-config master: Adding zuul jobs for rsd-virt-for-nova repo  https://review.openstack.org/64250015:45
mordredAJaeger: feel like a +3 on https://review.openstack.org/#/c/632532/ ?15:49
*** weshay68802228 is now known as weshay15:53
*** jamesmcarthur has quit IRC15:57
*** jamesmcarthur has joined #openstack-infra15:57
*** yamamoto has quit IRC15:59
*** jamesmcarthur has quit IRC16:00
AJaegermordred: that's system-config and I don't have the power to +3...16:00
*** yamamoto has joined #openstack-infra16:00
mordredAJaeger: oh - so it is!16:00
mordredAJaeger: I forget sometimes that you are not infinitely powerful16:01
*** jamesmcarthur has joined #openstack-infra16:01
*** ramishra has quit IRC16:02
dulekHi!16:02
dulekI'm trying to set up CoreDNS for K8s clusters deployed by DevStack in kuryr-kubernetes gates.16:02
dulekIs there any chance to learn the upstream DNS server address in DevStack?16:02
dulekAt the moment I'm looking at https://git.openstack.org/cgit/openstack-infra/system-config/tree/playbooks/group_vars/dns.yaml, but should I use dns_master or one of dns_notify?16:03
fungidulek: i saw your question earlier in scrollback, probably better to start with the problem than the solution you've jumped to16:03
dulekfungi: Sure thing! I'm setting up coredns pod to serve as DNS for pods.16:03
toskydmellado: https://matrix.org/blog/2015/06/22/the-matrix-org-irc-bridge-now-bridges-all-of-freenode/16:03
dulekfungi: Thing is, coredns needs to forward "outside-of-the-cluster" DNS queries to a DNS server that can resolve them.16:04
dulekfungi: Thing is - in gate's DevStack VM I only have 127.0.0.1 put into /etc/resolv.conf.16:04
dulekfungi: So that won't work from the coredns pod.16:05
AJaegermordred: ;)16:05
dulekfungi: So my two ideas at the moment are either to figure out the "real" DNS server and set coredns to forward there.16:05
dulekfungi: Or run coredns pod on hostNetworking, bind it to 127.0.0.<whatever> and forward to 127.0.0.1.16:06
fungidulek: challenges there are likely going to be related to the variety of network topologies we have in different service providers. in some cases we need to prefer resolving via ipv6 because all ipv4 egress is through a single nat which can get easily overwhelmed with too many simultaneous requests, while in other providers we have no global ipv6 routes at all and must use ipv4 for dns resolution16:06
dulekThe latter is not really appealing because that coredns is supposed to sit behind K8s Service and running it with hostNetworking would make stuff harder.16:07
*** ykarel_ has joined #openstack-infra16:07
clarkbdulek: fungi the unbound config should have already sorted out the ipv4 vs ipv6 for you16:07
clarkbit's a bit implementation specific but reading that config file might be simplest16:07
mordredyeah. if you read the unbound config you could get the info you need16:07
dulekfungi: Is that why that unbound instance is deployed on gate VMs?16:08
fungiright, my point was it's likely going to be better if coredns can forward its queries to the local unbound service16:08
dulekfungi: That would definitely be best.16:08
clarkbfungi: yup I agree, unfortunately lxc is the only container runtime that seems to also agree16:08
fungidulek: yes, that and to reduce the volume and latency for repeated queries16:08
clarkblxc runs a dnsmasq with an interface in both network namespaces to bridge between containers and host resolver16:09
clarkbdocker punts to google if you have localhost set as resolver16:09
mordredcause that's a good default behavior16:09
fungii don't suppose docker has improved that at all16:09
*** ykarel|afk has quit IRC16:09
clarkbI think k8s expects you to configure dns within k8s (aka read the unbound config and set it as appropriate)16:10
mordredyeah16:10
dulekclarkb: Ha, nice to know. In our case we're running K8s with Neutron ports serving the pods.16:10
mordredand that's what dulek is trying to set up and test16:10
mordredlike - the k8s version of what we're doing with unbound16:10
clarkbmordred: yup16:10
fungiso the job would need to be able to parse the unbound config and generate a similar kubernetes configuration i guess?16:11
mordredso - perhaps one option is "read the unbound config" - but maybe writing a similar role that provides the appropriate input variables for an in-k8s coredns setup wouldn't be a bad idea *waves hands*16:11
fungior do we expose the values we're setting in the unbound configuration as ansible variables?16:11
clarkbinfra-root I've just put afs01.ord and afsdb01.o.o in the puppet emergency file. The first because it is the server I want to upgrade today and the second so I can disable docs afs publishing during this process16:12
dulekCan I even see the unbound config? I mean - how complicated it is.16:12
dulekIn the end I was only looking for a single address to forward to. :P16:12
clarkbon mirror-update I intend to hold all the cron locks rather than disable crons there16:12
clarkbthe last piece I need to consider is the wheel publishing jobs16:13
clarkbthoughts on whether we want to pull those out of zuul or just let them potentially fail for a bit?16:13
clarkbdulek: it's pretty simple16:13
clarkbdulek: it's a plain text file with lines that say server: $ipaddress iirc16:13
* clarkb finds details16:14
mordreddulek: the configure-unbound role in openstack-zuul-jobs is where the magic i16:14
mordredis16:14
openstackgerritAdam Coldrick proposed openstack-infra/storyboard-webclient master: WIP: Automatically add security teams to security stories  https://review.openstack.org/64207116:15
clarkbdulek: /etc/unbound/forwarding.conf is the file and '  forward-addr: $NODEPOOL_STATIC_NAMESERVER_V6' is what the lines look like16:15
mordredclarkb: we could maybe zuul_return unbound_primary_nameserver and unbound_secondary_nameserver ...16:15
clarkbdulek: there may be more than one entry in that16:15
* dulek thanks anyone who decided to put codesearch.openstack.org up.16:15
clarkbmordred: I don't really want to make that a real interface. The way things should work is you use host dns or if you think you are smarter than us figure it out16:16
mordredclarkb: I hear that - but every user doing containerized things is going to hit this issue16:16
fungidulek: that was mostly done by taron, an outreachy intern who worked with us a few years back16:16
clarkbmordred: unless they use lxc like osa16:16
mordredclarkb: and as much as I think the container ecosystem has done this incorrectly, me thinking that isn't going to make things work16:16
openstackgerritMatt Riedemann proposed openstack-infra/elastic-recheck master: Remove query for bug 1806126  https://review.openstack.org/64250816:16
openstackbug 1806126 in OpenStack Compute (nova) "LibvirtRbdEvacuateTest and LibvirtFlatEvacuateTest tests race fail" [High,Fix released] https://launchpad.net/bugs/1806126 - Assigned to Matt Riedemann (mriedem)16:16
clarkbmordred: ya I guess the real issue here is knowing v4 vs v6 addr16:17
mordredso I don't think it's unreasonable for us to provide some sort of interface that people doing container-based things can use to get dns going properly, right?16:17
mordredclarkb: yeah16:17
*** yamamoto has quit IRC16:17
mordredclarkb: of course, that's probably complicated by whether or not the container subsystem has been configured to understand ipv6 :)16:17
clarkbmordred: yes16:18
dulekclarkb: Okay, so I assume first "forward-addr:" for zone "." is the one I'm looking for?16:18
clarkbwhich is mind boggling that these so called cloud native tools can't ipv616:18
clarkbdulek: ya first or second shouldn't really matter16:18
mordredclarkb: IKR?16:18
clarkbdulek: I think we prefer cloudflare to google so first will always be cloudflare then second will be google16:19
clarkbbut both are expected to work (and unbound round robins queries between them)16:19
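A minimal sketch of the "read the unbound config" idea discussed above, assuming the file contains `forward-addr:` lines under a single forward zone as clarkb describes. The function name `forward_addrs` and the sample addresses are hypothetical illustrations (documentation-range placeholders), not the actual gate configuration.

```python
import re

# Hypothetical sample shaped like the forwarding.conf lines quoted in the log.
# The addresses are documentation-range placeholders, not real resolvers.
SAMPLE = """\
forward-zone:
  name: "."
  forward-addr: 2001:db8::53
  forward-addr: 192.0.2.53
"""

FORWARD_RE = re.compile(r"^\s*forward-addr:\s*(\S+)")


def forward_addrs(text):
    """Return forward-addr values in file order, ignoring commented-out lines.

    Assumption: the file holds a single forward zone, so no per-zone
    scoping is attempted. Any "@port" suffix is returned unchanged.
    """
    addrs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        match = FORWARD_RE.match(line)
        if match:
            addrs.append(match.group(1))
    return addrs


if __name__ == "__main__":
    print(forward_addrs(SAMPLE))
```

In a job, the text would come from reading /etc/unbound/forwarding.conf on the test node; either of the returned addresses could then be handed to coredns as its upstream.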
openstackgerritThierry Carrez proposed openstack/ptgbot master: The PTG is no longer OpenStack-specific  https://review.openstack.org/64250916:19
openstackgerritThierry Carrez proposed openstack/ptgbot master: Update links to point to gitea  https://review.openstack.org/64251016:19
dulekclarkb: Okay, but that'll sometimes be IPv6, right? I doubt Kuryr's DevStack plugin is able to do IPv6, but well.16:19
dulekclarkb: Oh, I've actually tried forwarding to 8.8.8.8 and it hasn't worked on one RAX instance.16:20
clarkbdulek: yes because we have ipv6 only clouds16:20
clarkband on those clouds we prefer ipv6 dns because ipv4 has to go through a shared nat16:20
clarkbwhich mostly works but will sometimes fail16:20
dulek:)16:20
openstackgerritHelena proposed openstack-infra/project-config master: Adding zuul jobs for rsd-virt-for-nova repo  https://review.openstack.org/64250016:20
dulekOkay, thanks folks, I guess I have enough info to try this. :)16:22
*** pcaruana has quit IRC16:23
clarkbmordred: thinking about the ipv6 in containers problem we probably want any zuul_return or similar to list dns_primary dns_secondary as what we think you need to use based on ip version of local cloud region then also set a v4 and v6 pair of variables so that if you have cloud native tooling you can ipv4 only16:23
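A rough sketch of the variable layout floated above, assuming the job already knows the resolver addresses in preference order. `dns_primary` and `dns_secondary` come from the discussion; the `dns_v4` and `dns_v6` names and both function names are hypothetical, and nothing here is an existing zuul_return interface.

```python
import ipaddress


def split_by_version(addrs):
    """Group resolver addresses into IPv4 and IPv6 lists, preserving order."""
    by_version = {4: [], 6: []}
    for addr in addrs:
        by_version[ipaddress.ip_address(addr).version].append(addr)
    return by_version


def resolver_vars(addrs):
    """Build a dict of variables from resolvers listed in preference order.

    dns_primary/dns_secondary reflect the order the region prefers, while
    dns_v4/dns_v6 (hypothetical names) let IPv4-only tooling pick for itself.
    """
    grouped = split_by_version(addrs)
    return {
        "dns_primary": addrs[0] if addrs else None,
        "dns_secondary": addrs[1] if len(addrs) > 1 else None,
        "dns_v4": grouped[4],
        "dns_v6": grouped[6],
    }


if __name__ == "__main__":
    # Placeholder addresses from the documentation ranges.
    print(resolver_vars(["2001:db8::1", "192.0.2.1"]))
```

A consumer limited to IPv4 would read the first entry of `dns_v4` instead of `dns_primary`.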
*** pcaruana has joined #openstack-infra16:23
*** ykarel_ is now known as ykarel16:23
mordredclarkb: ++16:23
clarkbinfra-root re disabling wheel build jobs afs01.ord doesn't host any of the wheel volumes so I think I am going to not disable the jobs for this first host upgrade16:25
*** yamamoto has joined #openstack-infra16:25
mordreddulek: fwiw - this could be a good opportunity to get kuryr's devstack plugin to be able to do ipv6 :)16:26
fungiespecially if this resulted in a job which exercised it with ipv616:26
dulekmordred: Ha, sure thing, this will need to happen, but I'll look at that in 2 months. Until then I'm stuck in Spain, with ADSL internet without IPv6 enabled. :D16:28
dulekAnd yes, I like to develop on local VM. ;)16:28
mordreddulek: yay local development!16:29
*** ykarel is now known as ykarel|away16:29
*** pgaxatte has quit IRC16:30
*** mattw4 has joined #openstack-infra16:30
*** trown is now known as trown|lunch16:30
clarkbre ipv4 through single nat address. I think we should do everything we can to avoid using that interface but we know there are things that leak through it and if we've done what we can to reduce throughput as much as possible that should make ipv4 as reliable as possible16:31
clarkbthat is a really long way of saying "it should be ok to use ipv4 where you don't have another option but where there is the option we should prefer ipv6"16:31
clarkbinfra-root I also added step of moving /etc/openafs aside before doing the upgrade because the post reboot upgrade reenables openafs-fileserver and I don't want it attempting to join our cell without its vicepa volume mounted and happy16:32
clarkbbasically I'm trying to decouple the process of upgrading from the process of starting openafs back up again and making sure it works so that we can check the upgrade is happy before checking openafs is happy16:33
clarkbI expect that I will get started on that later this morning after I've caught up on email and such16:34
clarkbhttps://etherpad.openstack.org/p/afs-fileserver-trusty-to-xenial is my documented process if anyone wants to look that over before I jump into the deep end of the pool :)16:35
*** e0ne has quit IRC16:41
ildikovHi, I have a quick question looking for guidance. The StarlingX team would like to set up a dashboard for the test team like this one: http://reportportal.io16:44
dulekHey, one more question - if I read https://nlnetlabs.nl/documentation/unbound/unbound.conf/ correctly, by default unbound will only listen to queries from localhost, right?16:44
clarkbdulek: yes, our unbound only listens on localhost to avoid having an open resolver on the internet16:44
ildikovI wonder if there's any way to do it within our infra?16:44
dulekSo this will mean that it won't work for me to query it from inside my pods…16:45
*** helenafm has quit IRC16:45
clarkbdulek: we also have iptables firewall rules but container infrastructure tends to say no to those and then open things up in ways we don't want16:45
clarkbdulek: unless you can use lxc as a runtime (I don't know if k8s supports that)16:45
*** nsmeds has left #openstack-infra16:45
clarkbildikov: elastic-recheck is our tooling most similar to that16:45
mordredildikov: so - we probably want to dig in a little bit in to what such a thing is trying to accomplish and how it fits in to a zuul world16:46
mordredalso what clarkb said :)16:46
clarkbildikov: and tristanC has a spec up to do the ml part16:46
dulekclarkb: Nah, no way, we support Docker and cri-o as container engines.16:46
*** gyee has joined #openstack-infra16:46
clarkbildikov: http://status.openstack.org/elastic-recheck/ we write elasticsearch queries to fingerprint bugs then twice an hour we scan our elasticsearch database for occurrences of those fingerprints and publish the reports on that page16:47
iurygregoryHey everyone, quick question: to install everything under Python3 do I just need to set USE_PYTHON3=True in local.conf for devstack? In my patch trying to enable it I think it's installing python2.7 http://logs.openstack.org/74/642474/5/check/ironicclient-functional/b3d9e17/job-output.txt.gz#_2019-03-11_15_53_48_812130 even though it is set in the job config https://review.openstack.org/#/c/642474/5/zuul.d/ironicclient-jobs.yaml16:47
clarkblooks like indexing is behind again. I'm going to guess things have OOM'd16:47
openstackgerritMerged openstack-infra/elastic-recheck master: Remove query for bug 1806126  https://review.openstack.org/64250816:47
openstackbug 1806126 in OpenStack Compute (nova) "LibvirtRbdEvacuateTest and LibvirtFlatEvacuateTest tests race fail" [High,Fix released] https://launchpad.net/bugs/1806126 - Assigned to Matt Riedemann (mriedem)16:47
*** agopi|brb is now known as agopi16:47
clarkbiurygregory: probably a better question for the qa channel. My understanding is that for libs (like ironicclient) devstack will install the lib under python3 and python216:48
ildikovclarkb: mordred: thanks! I think one of the challenges is that the team is having builds and sanity, robustness, etc testing which may not be integrated with Zuul at this point, but I'll point them to elastic and see if we can converge somehow16:48
clarkbiurygregory: however, fungi has noticed recently that this may not work 100% as expected16:48
iurygregoryclarkb, ty o/16:48
iurygregoryops =X16:49
mordredildikov: yeah - first step is to get any testing that is not integrated with zuul integrated with zuul :)16:49
clarkbiurygregory: one possibility is that the python3 install that happens after python2 install isn't overwriting /usr/local/bin entries but I don't think anyone has run it locally and checked16:49
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Prevent local code execution via the raw module  https://review.openstack.org/64251816:49
rpittauclarkb, can this be the culprit? https://github.com/openstack-dev/devstack/blob/master/inc/python#L43216:49
iurygregoryclarkb, gotcha, i will try to debug some more and see with qa too =)16:50
mordredildikov: but - looking at that tool - we have all of the functionality already integrated between elastic-recheck and http://status.openstack.org/openstack-health/#/16:50
clarkbildikov: to use that as a release health indicator I like to ensure the classification percentage is high (meaning we've identified why most jobs fail, say 90%). Then you can look at the occurrence of specific bugs to determine how healthy the project is overall16:50
mordred++16:50
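The classification percentage described above can be sketched in a few lines. This is only an illustration: the function name and the input shape (a list of failed-job records, each carrying the list of bug numbers whose fingerprints matched) are assumptions, not elastic-recheck's actual data model.

```python
from collections import Counter


def classification_report(failures):
    """Summarize failed jobs (hypothetical record shape).

    failures: list of dicts like {"job": str, "bugs": [int, ...]}, where
    "bugs" holds the bug numbers whose fingerprint queries matched the job.
    Returns (percent_classified, Counter mapping bug number -> hit count).
    """
    total = len(failures)
    # A failure counts as classified if at least one fingerprint matched.
    classified = sum(1 for f in failures if f["bugs"])
    percent = 100.0 * classified / total if total else 0.0
    # Count how often each known bug shows up across all failures.
    per_bug = Counter(bug for f in failures for bug in f["bugs"])
    return percent, per_bug
```

A high percentage means most failures are explained by known bugs, at which point the per-bug counts become a meaningful health signal.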
clarkbrpittau: ya you can see it installs it twice. First under python2 then under python3, but in some cases we've noticed that the /usr/local/bin/ entrypoints seem to want to use python2 not 316:51
clarkbrpittau: it is possible that there is a bug in pbr writing out the entrypoint script content's shebang line or that pip isn't overwriting the entry16:52
*** udesale has quit IRC16:52
*** electrofelix has quit IRC16:52
clarkbI reviewed the pbr code and fungi tested it and it seemed to do the right thing. My current guess is pip doesn't overwrite for some reason. Possibly because we are editable?16:52
ildikovmordred: clarkb: got it, thank you :)16:53
mordredildikov: happy to have further chats about it as needed of course!16:54
*** priteau has quit IRC16:54
openstackgerritMerged openstack-infra/system-config master: Split python-base into its own Dockerfile  https://review.openstack.org/63253216:55
*** e0ne has joined #openstack-infra16:57
*** noama has quit IRC16:58
ildikovmordred: sounds good, thank you17:00
fungiclarkb: the other possibility (in the case of the job i was looking at over the weekend there were tracebacks indicating privsep-helper was sometimes called under 2.7 and sometimes under 3.x) is that we're invoking it differently in different places17:01
fungione thing i want to try is a dnm change to rip out the conditional block at http://git.openstack.org/cgit/openstack-dev/devstack/tree/inc/python#n441 and then do some depends-on to see what happens with those current failure cases17:04
fungithough odds are someone else will get to that before i have time for it17:04
pabelangerclarkb: mordred: fungi: Not sure if you saw it on Friday, but https://review.openstack.org/642100/ starts the process of creating a zuul-specific tenant for zuul17:05
clarkbre e-r being behind it appears the e-s cluster is red due to a lost shard. This seems to have made logstash unhappy?17:06
fungigreat!17:06
clarkbI think maybe if logstash is trying to write to that broken shard it blocks17:06
*** ginopc has quit IRC17:07
clarkbthe shard belongs to the 8ths index so if I restart things we should roll forward so I'm going to restart some things and see if we can get it past that17:07
clarkbactually17:07
clarkbwe may just want to delete that index entirely to avoid trouble17:07
clarkbI'm going to go ahead and do that17:08
*** rpittau is now known as rpittau|afk17:09
*** wolverineav has joined #openstack-infra17:09
*** kopecmartin is now known as kopecmartin|off17:10
clarkbthat seems to have gotten things moving. We'll have a hole but not sure we could avoid that given the broken index17:11
*** priteau has joined #openstack-infra17:12
*** e0ne has quit IRC17:13
*** wolverineav has quit IRC17:14
*** tosky__ has joined #openstack-infra17:14
*** tosky has quit IRC17:14
*** rfolco has joined #openstack-infra17:14
*** e0ne has joined #openstack-infra17:14
*** rfolco|ruck has quit IRC17:16
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/nodepool master: DNM: Pin to Kubernetes 9 beta until it releases  https://review.openstack.org/64252417:18
*** tosky__ is now known as tosky17:19
*** dtantsur is now known as dtantsur|afk17:19
*** jpich has quit IRC17:29
*** ykarel|away has quit IRC17:29
*** e0ne has quit IRC17:31
*** mriedem is now known as mriedem_afk17:31
*** iurygregory has quit IRC17:35
*** jamesmcarthur has quit IRC17:40
*** jamesmcarthur has joined #openstack-infra17:41
openstackgerritMerged openstack-infra/zuul master: Prevent local code execution via the raw module  https://review.openstack.org/64251817:45
*** jamesmcarthur has quit IRC17:45
clarkbalright meeting agenda is sent. I'm going to find food then when I'm back start on afs01.ord upgrade17:47
*** trown|lunch is now known as trown17:49
fungibtw, topic:xenial-upgrades has a couple patches to move us forward on the wiki-dev upgrade17:51
clarkbfungi: great I'll take a look before diving into afs upgrades then17:51
openstackgerritMerged openstack-infra/system-config master: Use opendev logos  https://review.openstack.org/64217917:55
*** derekh has quit IRC18:00
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Add Python3 project templates for Train release  https://review.openstack.org/64187818:06
*** wolverineav has joined #openstack-infra18:10
*** dpawlik has joined #openstack-infra18:11
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul-preview master: WIP: Begin refactoring code for unit testing  https://review.openstack.org/64224518:13
*** jpena is now known as jpena|off18:13
*** wolverineav has quit IRC18:15
*** wolverineav has joined #openstack-infra18:15
*** wolverineav has quit IRC18:15
*** wolverineav has joined #openstack-infra18:15
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul-preview master: WIP: Begin refactoring code for unit testing  https://review.openstack.org/64224518:18
clarkbfungi: +2 on both. You might also want a change to bump the testing of those nodes to xenial in manifests/site.pp18:18
* clarkb attempts to grab locks on mirror-update.o.o18:19
*** jamesmcarthur has joined #openstack-infra18:20
*** panda|rover is now known as panda|rover|off18:20
*** e0ne has joined #openstack-infra18:23
*** e0ne has quit IRC18:25
*** jcoufal has quit IRC18:25
*** jamesmcarthur has quit IRC18:26
clarkbinfra-root I've grabbed all locks on mirror-update and disabled the docs release cron on afsdb0118:27
clarkbI am going to proceed with upgrading afs01.ord.o.o in place now18:27
openstackgerritJeremy Stanley proposed openstack-infra/zuul-jobs master: [DNM] exercise base-test as parent in unittests  https://review.openstack.org/64253618:27
clarkbok prep work is done and I'm ready to run do-release-upgrade now. I thought about doing a snapshot but I've got the snapshot I already made which should be recent enough18:33
clarkbhere goes18:33
openstackgerritJeremy Stanley proposed openstack-infra/zuul-jobs master: [DNM] exercise base-test as parent in unittests  https://review.openstack.org/64253618:36
openstackgerritJeremy Stanley proposed openstack-infra/zuul-jobs master: [DNM] exercise base-test as parent in unittests  https://review.openstack.org/64253618:39
*** diablo_rojo has joined #openstack-infra18:40
*** eernst has joined #openstack-infra18:45
*** jamesmcarthur has joined #openstack-infra18:47
*** eernst has quit IRC18:50
clarkbwaiting on dkms to build kernel modules18:53
clarkbotherwise upgrade has gone exactly as I've documented it so far18:53
*** jcoufal has joined #openstack-infra18:54
*** priteau has quit IRC18:55
*** e0ne has joined #openstack-infra18:57
*** eernst has joined #openstack-infra18:58
*** aojea has quit IRC19:00
*** kgiusti has left #openstack-infra19:02
*** eernst has quit IRC19:02
*** mriedem_afk is now known as mriedem19:04
*** eernst has joined #openstack-infra19:04
*** jcoufal has quit IRC19:06
clarkbI'm doing what should be the last reboot now19:08
clarkbssh isn't coming back quickly making me wonder if it is fscking the vicepa volume19:08
*** eernst has quit IRC19:09
* clarkb waits patiently (fwiw I have rebooted it post upgrade already so I'm fairly certain the new kernel and all that work and I've mounted the vicepa volume manually before rebooting so expect that to work too)19:09
clarkbhowever it does not ping and I'm not sure if I should expect it to ping during boot if fscking19:09
clarkbconsole shows the ubuntu 16.04 boot splash iirc that overlays any fsck output19:12
clarkband the little dots are changing color so it is doing something19:13
clarkbinfra-root ^ let me know if you think I should do something other than be patient under the assumption fsck is in control19:13
mordredclarkb: I think patience is the best bet19:14
*** yamamoto has quit IRC19:14
fungiit's almost certainly the dkms build(s)19:17
fungihow many kernels were installed?19:17
fungioh, wait, that was before you rebooted though19:17
clarkbyes the upgrade fully completed and we did a reboot then19:17
fungiyeah, the boot process should eventually time out if there's a problem waiting for a block device to become available or something19:18
clarkbthat came back, I fixed puppet apt, reinstalled apt, and then set up openafs configs and remounted vicepa. Then rebooted again (this is where I'm stuck)19:18
clarkbinternet suggests an f2 might escape the splash screen19:18
clarkbhrm f2 shows me what looks like output from the previous shutdown which isn't what I really want19:19
fungiis it not done halting yet maybe?19:20
fungior... i know they're doing a mass migration in their dfw region... did you maybe translarently trigger a cold instance migration by rebooting?19:20
fungier, transparently19:21
clarkbmaybe?19:21
corvussometimes 'esc' get you out of the splash screen19:21
fungiprobably need to review open tickets in our tenant there to find out if they're trying to get us to reboot some so they'll migrate19:21
clarkbcorvus: ya esc toggles back and forth. I'm beginning to wonder if fungi's theory is the one19:21
clarkbit does appear we are actually stuck on the shutdown side of the reboot and that may be because we haven't booted our new state on new hypervisor yet19:22
clarkbwould also explain why ping doesn't work because the network stack is off19:22
clarkbfungi: I don't see any open tickets though19:23
fungihrm...19:23
*** xek has quit IRC19:23
clarkbthis server is in ord though19:23
clarkbdoesn't look like the ticket system is region specific19:23
*** xek has joined #openstack-infra19:23
fungioh, nevermind. i hadn't heard anything about ord migrations anyway, just dfw19:23
clarkbpart of me wants to reboot it from the openstack side of things19:24
fungithere are a couple tickets for that tenant titled "[ACTION REQUIRED] DFW Datacenter Migration" it seems19:24
clarkbas far as the openstack api is concerned the server is ACTIVE19:24
fungiand also one "Cloud Server Incident Notification" ticket open19:25
clarkbya I think an api reboot is the next thing19:25
clarkbinfra-root ^ any objections or alternative suggestions?19:25
corvusclarkb: afsd wasn't running, correct?19:26
clarkbcorvus: bosserver/openafs-fileserver was not running19:26
fungithe cloud server incident was for the old pre-upgrade graphite server so i closed it19:26
clarkb(aiui the openafs-fileserver service runs bosserver which runs the other services as child processes)19:27
*** eernst has joined #openstack-infra19:27
fungithe datacenter migration tickets were for ask-staging and afs02.dfw19:27
fungiask-staging was migrated for us already so i closed that ticket19:27
corvusthen that seems pretty safe to me.  even if the ext4 partition is still mounted, ext4 should be able to handle that.  i'd worry a bit more (like, would we have to do a volume recovery) if afsd were running.19:27
clarkbcorvus: got it, ya I'm fairly certain I managed to stop those services19:28
fungiafs02.dfw will be rebooted for us at or after "March 09 2019 01:29 UTC"19:28
*** eernst has quit IRC19:28
clarkbproceeding with the reboot now19:28
*** eernst has joined #openstack-infra19:28
fungiuptime on afs02.dfw is showing 6 days, so i don't think they've rebooted it (yet)19:28
clarkbok server is back up. vicepa is mounted, bosserver is running and afsdb01 bos status says running normally19:30
clarkbcorvus: thoughts on other stuff to check before I turn on puppet and release some lock files19:31
clarkbvos listvldb doesn't show anything sad about the ord volumes that I can see either19:32
clarkbI am going to reenable puppet which will ensure that half of things is happy too. Then once we are overall happy we can do afs02.dfw19:33
clarkbactually you know I've seen similar reboot/shutdown behavior locally with libvirt where it gets out of sync with the virtual acpi19:34
clarkbI wonder if new kernel doesn't play nice with xen/rax19:34
clarkband maybe we need to install some package we'd normally get via rax images19:34
clarkbwe can do a dpkg listing and compare to rax xenial image19:34
clarkbI have reenabled puppet on afs01.ord19:35
*** eernst has joined #openstack-infra19:36
fungiperhaps the rackspace/xen agent we've got installed is incompatible with newer kernels and needs a different version obtained from somewhere? seems unlikely, but possible i suppose19:36
*** gfidente is now known as gfidente|afk19:36
*** bringha has joined #openstack-infra19:37
clarkbI see that docs has a backup version which I think means it is ready for a vos release? so once we reenable the cron for docs release we should see that everything is working overall19:37
clarkbI'll do that once puppet is happy19:37
clarkbfungi: ya this reboot was the first from new kernel to off. The previous reboot was old kernel to off to new kernel on19:38
fungiour instructions for updating the base job say to start by making sure base-test is identical to base... how are people generally confirming that?19:40
*** eernst has quit IRC19:40
*** e0ne has quit IRC19:40
fungiright now i've got a slew of post_failure job results when i test reparenting to base-test19:40
*** e0ne has joined #openstack-infra19:40
fungiand i realize i skipped that step19:40
*** eernst has joined #openstack-infra19:42
fungi`diff -ru playbooks/base{,-test}` does indeed have some bits i should check out19:42
clarkbfungi: I'm guessing manual diffs :/ fwiw that job would've been moved into opendev/base-jobs and it is possible that the move wasn't 100% correct for base-test19:42
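The manual comparison of `playbooks/base` against `playbooks/base-test` could also be scripted. A minimal sketch using Python's standard `filecmp` module follows; the function name is hypothetical, and the directory paths in the usage note are taken from the `diff -ru` command quoted above.

```python
import filecmp
import os


def differing_files(left, right):
    """Return sorted relative paths that differ between two directory trees.

    Includes files whose contents differ and entries present on one side only.
    """
    out = []

    def walk(cmp, prefix):
        # Entries that exist in only one of the two trees.
        for name in cmp.left_only + cmp.right_only:
            out.append(prefix + name)
        # Compare file contents byte for byte (shallow=False) rather than
        # trusting size/mtime signatures.
        for name in cmp.common_files:
            a = os.path.join(cmp.left, name)
            b = os.path.join(cmp.right, name)
            if not filecmp.cmp(a, b, shallow=False):
                out.append(prefix + name)
        # Recurse into directories common to both sides.
        for sub, subcmp in cmp.subdirs.items():
            walk(subcmp, prefix + sub + "/")

    walk(filecmp.dircmp(left, right), "")
    return sorted(out)
```

Calling `differing_files("playbooks/base", "playbooks/base-test")` from a checkout would return an empty list when the two jobs are identical.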
*** xek has quit IRC19:43
*** eernst has quit IRC19:43
*** eernst has joined #openstack-infra19:43
*** xek has joined #openstack-infra19:43
*** eernst has quit IRC19:43
fungiit does look from git history as if the base-test playbooks were different when they were copied to the new repo19:43
*** eernst has joined #openstack-infra19:44
fungithis is the current diff between the playbooks: http://paste.openstack.org/show/747569/19:45
fungisomeone was testing some change to the upload-logs role19:46
clarkbfungi: I would probably start by reverting that?19:47
corvusyeah, just copy base to base-test19:47
fungiyeah, technically a `git revert ...` isn't going to work since this was carted in from a different repo19:48
corvusi don't think anyone's testing anything right now.  normally i'd check git log, but that's too much trouble with the repo move.  :)19:48
openstackgerritJeremy Stanley proposed opendev/base-jobs master: Reset base-test playbooks to match base  https://review.openstack.org/64255019:50
fungiclarkb: corvus: ^19:50
fungithanks! (to AJaeger too)19:54
fungionce it merges i'll hopefully be able to finish setting up the necessary chain to demonstrate unit tests running on the correct distros for different branches of a random openstack project19:54
*** jcoufal has joined #openstack-infra20:03
*** yamamoto has joined #openstack-infra20:06
openstackgerritMerged opendev/base-jobs master: Reset base-test playbooks to match base  https://review.openstack.org/64255020:06
*** bringha has quit IRC20:10
*** e0ne has quit IRC20:11
*** e0ne has joined #openstack-infra20:12
*** yamamoto has quit IRC20:12
*** e0ne has quit IRC20:15
clarkbpuppet ran with no change on afs01.ord20:17
clarkbI'm going to enable the docs publishing cron on afsdb01 now20:18
fungithat sounds awesome. do we want more reboot tests?20:19
ianwinfra-root / corvus : i'm at the point of being ready to send some git:// to https:// changes, can you review this commit message :20:25
ianwhttps://git.openstack.org/cgit/openstack-infra/system-config/tree/tools/mass-git-change/replace.sh?h=refs/changes/14/642314/3#n2020:25
clarkbfungi: ya we can disable the bosserver via afsdb01 after docs release situation is happy and then do another reboot20:26
clarkbfwiw we've released docs but I still see RWrite: 536870991     ROnly: 536870992     Backup: 53687099320:26
clarkbI had assumed backup being higher than the other two meant we need to release to catch up but maybe that isn't what it means20:27
clarkbcorvus: ^ do you know what that means off the top of your head?20:27
corvusclarkb: those are just volume ids, they don't change20:28
clarkbcorvus: got it, any good way of checking vos release being successful?20:28
clarkb I guess I can track a docs change and see it on the web browser side of things20:28
clarkbianw: lgtm20:28
corvusclarkb: yes, 'vos examine' should tell you20:29
corvusclarkb: i'm pretty sure if one of the sites is behind, it says so20:29
clarkbcorvus: thanks20:29
corvusclarkb: you can 'vos examine docs.readonly' to see the "last update" time20:30
corvusthat'll be the last time it was released20:30
clarkblast update was 5 minutes ago20:32
corvus(you can also 'vos examine 536870992' since that's the volume id for the read only volume)20:32
clarkbhttp://paste.openstack.org/show/747574/ so that lgtm20:32
corvusclarkb: agree20:32
clarkbok I'll shutdown that server via the bos command on afsdb01 then reboot20:32
clarkbto see if we get a clean reboot this time. If we don't then I'll dig into dpkg diffs and see if there are any apparent rax image deltas that we might need to address20:33
mordredianw: lgtm20:34
ianwclarkb: fyi afsmon takes the creation date -> http://git.openstack.org/cgit/openstack-infra/afsmon/tree/afsmon/__init__.py#n86 , that's what builds the dashboard's "last release" time20:35
clarkbheh it might be doing a fsck this time20:36
clarkbI see /dev/xvda1: clean: etc20:36
clarkbbut I can ssh in so maybe it only fscked xvda120:36
clarkbbos status says it is back and running20:37
clarkbI think I am going to call afs01.ord good now20:37
clarkbI'm adding afs02.dfw.o.o to the emergency file now20:38
fungibe aware that's the one with the open ticket about the pending (though in theory scheduled for a couple days ago) reboot migration20:38
clarkbfungi: ya I've actually got reboot in the list of steps before we do the release upgrade20:39
clarkbhopefully that catches any pending migrations20:39
fungiwhich i suspect they either didn't perform or ended up not needing to perform given the listed uptime on the server20:39
openstackgerritClark Boylan proposed openstack-infra/project-config master: Disable wheel mirror updates for afs server upgrades  https://review.openstack.org/64256220:42
openstackgerritClark Boylan proposed openstack-infra/project-config master: Revert "Disable wheel mirror updates for afs server upgrades"  https://review.openstack.org/64256320:42
clarkbinfra-root can I get reviews on the first change there to avoid attempting to publish wheel mirror updates while I do afs02.dfw?20:42
clarkbI will WIP the second change20:43
clarkbto recap the upgrade on afs01.ord the only unexpected thing was the broken reboot after booting new kernel20:45
clarkbsubsequent reboots work20:45
clarkboverall relatively straightforward. Also the lack of puppet updates on the upgraded server implies I picked do-release-upgrade question answers properly :)20:46
clarkb(if I had overwritten a file managed by puppet it would update the file contents post upgrade)20:46
clarkbI've still got mirror-update locks held and have redisabled the docs publishing cron. Once the change above merges I'll be ready to do afs02.dfw20:47
fungianybody working on zuul executor restarts yet? looks like we've got the raw fix installed but the executors are still running on code from friday20:50
clarkbI am not. I've got afs things paged in and trying to run through that as much as possible20:50
corvusfungi: nope.  you want to do it, or shall i?20:52
*** jamesmcarthur has quit IRC20:56
*** e0ne has joined #openstack-infra20:56
clarkbhttp://git.openstack.org/cgit/openstack/openstack-ansible/tree/scripts/fastest-infra-wheel-mirror.py20:58
clarkbdiscovered ^ doing a code search on the wheel mirror jobs20:58
fungicorvus: i can start in a few minutes. do we have a specific playbook/procedure for that?20:59
*** luizbag has quit IRC21:01
clarkbfungi: we have a playbook to restart all the zuul services, you should be able to trim it down to just the executors21:02
openstackgerritMerged openstack-infra/project-config master: Disable wheel mirror updates for afs server upgrades  https://review.openstack.org/64256221:02
clarkbfungi: possibly by --limit ze*.openstack.org ?21:02
fungiyeah, last likely candidate i found in my shell history on bridge.o.o was `sudo ansible ze*.openstack.org -m shell -a 'systemctl restart zuul-executor'`21:03
clarkbI'm grabbing something to drink then starting with afs02. That merge was the last thing I needed in the disable things that might do stuff I don't want list21:04
fungii should have started drinking already! ;)21:04
*** mattw4 has quit IRC21:04
*** e0ne has quit IRC21:04
*** mattw4 has joined #openstack-infra21:05
corvusfungi: yeah, that systemctl command should be fine21:05
*** trown is now known as trown|outtypewww21:05
fungirunning that in that case21:06
clarkbheh not that kind of drinking.21:06
clarkbthough I grabbed a bottle of whiskey aged on oregon white oak because it's somewhat novel and actually not bad either21:06
funginow to check that the pidfile gets refreshed on them all21:07
*** e0ne has joined #openstack-infra21:12
fungino pidfiles on a lot of the executors now, checking one for sanity21:15
fungiActive: active (running) since Mon 2019-03-11 21:06:29 UTC; 9min ago21:16
fungilast entry in its debug log though is: 2019-03-11 21:10:43,743 DEBUG zuul.log_streamer: LogStreamer stopped21:17
corvusthey take a while to stop21:17
corvusfungi: maybe a separate stop and start are needed, depending on what systemd does with restart21:18
*** xek has quit IRC21:18
fungiwill give that a shot21:18
corvusfungi: they take ~10m to stop usually21:18
ianwclarkb: wow, that script is ... something21:19
*** mattw4 has quit IRC21:21
*** mattw4 has joined #openstack-infra21:23
fungiokay, all 12 executors now have a pidfile with a 21:20 timestamp21:23
corvushttp://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1 looks reasonable21:24
clarkbok afs02.dfw did something a little different than afs01.ord. It made me configure openafs client because I moved the config aside21:24
clarkbI just went with the defaults since I will move the config back when bringing bosserver back online21:24
clarkb(it asked for cell name and cache size)21:25
*** whoami-rajat has quit IRC21:25
zbrianw: can you please comment on https://review.openstack.org/#/c/639951/ ?21:26
*** mattw4 has quit IRC21:28
*** mattw4 has joined #openstack-infra21:29
openstackgerritJames E. Blair proposed openstack-infra/zuul-jobs master: Add no_log entries to skopeo copy commands  https://review.openstack.org/64257421:29
fungiyeah, seems the executors are running normally once more (or at least coming back to normalcy)21:31
fungi#status log restarted zuul executors for security fix 5ae25f021:32
openstackstatusfungi: finished logging21:32
clarkbfungi: now is the part where I am waiting on dkms21:32
*** mattw4 has quit IRC21:32
fungifor the record, `systemctl restart zuul-executor` seems to only have stopped them, not started them21:33
*** mattw4 has joined #openstack-infra21:33
fungiyet registered them as running21:33
corvusmordred, tristanC, tobiash: when i visit https://opendev.org/ i see a gitea logo in the top left.  but when i shift-reload, it turns to the opendev logo.  but when i click 'home' it turns back to the gitea logo.  i'm pretty sure this is related to service workers.  any idea how to fix that (other than asking users to tell their browser to delete service workers?)21:33
fungithe separate stop and start corvus suggested did the trick though21:33
*** jcoufal has quit IRC21:36
*** dave-mccowan has quit IRC21:36
clarkbcorvus: mozilla bugzilla implies that for firefox you have to clear those cache items through the "storage inspector"21:37
clarkbI don't know what storage inspector is21:37
corvusyeah, anything that requires a user to do something is a no go :)21:38
*** e0ne has quit IRC21:38
corvusis the local storage cleared when the service worker is updated with a new version?  does it ever clear the local storage for any (other) reason?21:40
*** pcaruana has quit IRC21:45
ianwzbr: hrrm, my immediate feeling is that i don't think the package manager is a great thing to switch on.  i understand it happens to be a proxy for the release of the platform21:46
ianwbut across dib, devstack, etc we just do YUM= and select either yum or dnf ... for all intents they're compatible21:47
zbrianw: i think that the ultimate question is which "atom" name to use to identify the python3-first platforms.21:47
zbrusing "distro-ver" does not scale at all, bindep.txt files would become a total mess soon if we start adding versions.21:48
zbri have no strong feelings for the name, just want to find one that we can all agree, so we can start using it.21:49
clarkbfwiw I don't know that it will be that bad21:49
clarkbonce you cross the threshold you can clean up the legacy stuff21:49
zbrclarkb: based on my prev experience some distros are really hard to kill, ;)21:50
zbrthere is always someone who still wants them listed, aka super-extra-extended-support kind of...21:50
clarkbinfra-root afs02.dfw.openstack.org is upgraded and up and running with a happy status report from bos status. Now a question of process. Would you like me to reenable all of the publishing of volumes then let afs burn in until tomorrow morning when I upgrade afs01.dfw.o.o. This should give us a high degree of confidence everything works as expected. Or do you think I should just go ahead and upgrade21:53
clarkbafs01.dfw.o.o now with the locks and jobs disabled?21:53
*** dpawlik has quit IRC21:53
clarkbafs01.dfw is our RWrite volume server for all volumes fwiw21:53
corvusclarkb: i'm not sure burn in will increase confidence by much, so if you want to plow through, that sounds reasonable.21:54
corvususually if it breaks, it breaks immediately and hard :)21:54
clarkbcool I'll proceed then21:54
clarkbrather get this done than wait longer21:54
fungiwfm21:54
clarkbI've put afs01.dfw in the puppet emergency file. Before I start on that server I'll disable afs02.dfw again and reboot it again to make sure that subsequent reboots work as expected (they did on afs01.ord)21:55
*** jtomasek has quit IRC21:56
clarkbzbr: ya I guess I'm thinking centos 7 is the only one that is python221:57
clarkbzbr: so you'll have to set centos7 packages and not centos7 packages which isn't so bad21:57
clarkband it is happy after that last reboot21:57
zbrclarkb: unless you also count for rhel-7... and we already have two. others may start to popup... i don't know.21:58
ianwzbr: can we explore the problem with some full examples @ https://etherpad.openstack.org/p/bindep-python322:05
*** rfolco is now known as rfolco|ruck|off22:05
ianwzbr: but then the issue is adding in rhel8, right?22:07
zbrianw: yeah, i am trying to find a way to avoid all extra maintenance work.22:08
ianwzbr: so when it comes down to it, your major concern is that [platform:fedora platform:centos-8 platform:rhel-8] is not forward compatible?22:13
zbrianw: yeah. alternative would be using centos-7 and rhel-7 as conditions, with negative matches.22:15
clarkbbuilding dkms things on afs01.dfw as part of the upgrade now22:15
fungihow does platform:redhat differ from platform:rpm in the absence of non-redhat-derived rpm-based distros?22:15
fungilike, until a project decides to add, say, suse support you can always just operate off the assumption that platform:rpm means "all the rpm-based platforms i expect to support right now"22:16
zbrianw: btw, what are the returned values for rhel now? (i do not have one to ssh to it right now)22:16
ianwzbr: i don't know, i'm just guessing22:16
clarkbthere are rhel examples in the bindep test suite22:16
clarkbfor 7 not 822:16
clarkbworkstation and server22:16
*** eernst has quit IRC22:17
ianwyeah, it is "rhel"22:17
zbrbut we can safely use "rhel". do we also have a rhel-7 atom?22:17
clarkbyes you should have both22:18
ianwi think i'm feeling the issue, that we need to "explode" platform:rpm !platform:fedora to explicit matches22:19
ianwwhat if we supported a "+"?22:20
funginot 100% sure what the thrust of this exploration exercise is, but keep in mind bindep was designed with the expectation that you use the broadest possible platform profiles until someone tells you they're insufficient for a given platform, because in many cases the package name is identical across all distros, next most common is that rpm-based distros tend to use one naming convention while22:21
zbri updated the etherpad with my practical example for molecule. it should work but not tested yet.22:21
fungidpkg-based distros use a different convention but within their general family they still all use the same name for that package... least most common is having to handle package name differences between distros in the same family or between releases of the same distro (i.e. packages getting renamed/replaced)22:21
zbrfungi: yeah, in fact most tools share the same names, like "git", "gcc".22:23
clarkbianw: that becomes tricky with suse going from 42 to 15, but that might help overall22:23
fungizbr: leaving me to wonder why you have, for example, python-pip [platform:redhat platform:dpkg]22:23
fungiwhy is that not just python-pip with no platform profiles?22:23
ianwfungi: well i think on that etherpad page, when we add in rhel-8, how do we keep things relatively sane with different python package names?  that's a constrained problem22:24
fungioic, packages getting renamed to python3-pip?22:24
zbrfungi:  there is no such thing as python-pip on brew (macos).22:24
ianwyeah, then you want python3 on fedora,centos-8,rhel-8 but not <= centos-722:25
zbrfungi: is better to miss one distro than adding a false dependency.22:25
fungizbr: and that list you have there is expected to work on brew?22:25
fungi"better to miss one distro than adding a false dependency" is sort of the opposite of how bindep is designed to be used22:25
*** tosky has quit IRC22:26
zbrfungi: to answer your question, brew is not listed because I didn't have time.22:26
fungias i said, the expectation is you use the broadest possible platform profiles (or none at all) until someone proposes making them more granular to support an additional platform with the list of packages you have22:26
ianwclarkb: sorry, is docs.openstack.org giving forbidden atm?22:26
fungiso, yeah, if you're working on making the list there also support brew then it makes sense, that wasn't clear to me from context22:27
fungiianw: confirmed22:27
fungiclarkb: hiccough with the afs upgrade?22:27
clarkbnot that I've seen on my side yet22:28
clarkbit is doing dkms things22:28
johnsomUmmm, https://docs.openstack.org/octavia/latest/ is giving Forbidden -You don't have permission to access /octavia/latest/ on this server.22:28
fungii'm getting 403 forbidden at https://docs.openstack.org/ currently22:28
clarkbI wonder if we are serving from the rw volume instead of the ro22:28
johnsomAh, ianw got here first...22:28
fungiDocumentRoot /afs/openstack.org/docs22:29
clarkbhrm22:29
fungiaccording to the vhost22:29
clarkbdoes that imply afs01.ord isn't actually working?22:29
clarkbsince it hosts the other RO volume for docs22:29
fungils: cannot access '/afs/openstack.org/docs': Connection timed out22:29
clarkbfwiw zuul-ci.org does load, is served by the same apache but afs02.dfw serves as backup RO volume22:30
clarkbI'm most of the way through the upgrade on afs01.dfw22:30
clarkbI think I should continue? that will get docs back when afs01.dfw is back22:30
fungiyeah, i think we have some documentation on how to swap those22:30
openstackgerritMerged openstack-infra/zuul master: Increase timeout of test_plugins  https://review.openstack.org/64180322:30
openstackgerritMerged openstack-infra/zuul master: Fix test race in test_container_jobs  https://review.openstack.org/64179122:30
clarkbfungi: fwiw that path is the RO path aiui22:31
clarkbfungi: which implies afs01.ord isn't serving the data I think22:31
clarkbbos status says afs01.ord is running normally22:31
fungihrm, yep22:33
clarkbiptables shows the expected ports are open22:33
fungiahh, found, the docs we have are for vos move actions22:33
clarkbfungi: I'm not sure a vos move will help?22:33
clarkbI wonder if the client on files has stale info so it's looking for the data only on the shut down server?22:33
clarkbok afs01.dfw is ready for rebooting to come up on the new kernel.22:34
clarkbI'm going to tell it to do that now then we can get things set up to serve the content for docs too22:34
fungithis is what files02 sees: http://paste.openstack.org/show/747582/22:35
fungi`vos listvol -server afs01.ord.openstack.org` also shows docs.readonly as On-line22:37
ianwfungi: i wonder if it's because it's lost contact with ord previously, then it's back, then dfw has gone22:39
ianwi.e. deep openafs bug22:39
fungilike if we'd waited longer in between it would have been seamless?22:39
ianwaka, turn it on and off again, and maybe it works22:39
clarkbafs01.dfw should be back soon22:40
ianwfungi: like something in the state machine of what to access has gone wrong; cause yeah, vos looks like it's ok, but ls /afs/openstack.org is not22:40
clarkbafs01.dfw is back22:41
clarkband so is docs22:41
clarkbI would like to reboot afs01.dfw one more time to ensure it works without a hard reboot22:41
fungiconfirmed, it started working as soon as afs01.dfw returned22:41
clarkbshould I go ahead and do that knowing that we may take another docs.o.o outage?22:42
fungisure, unless we want to use that to diagnose what's going on with files0222:42
clarkbfungi: ianw maybe you can restart afsclient services on files02 after I stop the bosserver on afs01.dfw?22:42
fungiworth a shot22:42
clarkbfungi: I'm guessing that ianw's hunch is not far off22:42
ianwwe can try ... although it might be kernel-level ish22:42
clarkbbasically the client side has gotten confused about where it should get the data from22:42
clarkbianw: fungi ok let me know when I should stop afs01.dfw for its last reboot22:43
fungigo for it. i've got the commands to stop/start the openafs-client service queued up22:43
ianwi think now, and yeah let's try openafs-client stop/start on files02 and see if it works22:44
ianwfungi: ++22:44
clarkbok it is stopped22:44
fungithings are still working22:44
clarkbwhy don't you check things work on files before I reboot afs01.dfw since that will race22:44
fungimaybe cached?22:44
clarkboh ya could it be that we hit the cache timeout?22:44
fungii'm browsing around the site successfully still22:45
clarkbshould I hold off on the reboot? the service will auto start on boot so want to wait until you think you are done debugging22:45
clarkbalso possible that dfw coming back kicked the state machine in the client22:45
clarkbso it knows to fall over to ord22:45
*** mriedem has quit IRC22:45
corvusi just did a local ls on my workstation22:45
corvusit initially took a while to realize that the server was down, but then apparently switched to a replica successfully22:46
fungiyeah, it seems like it's now no longer breaking even with afs01.dfw back offline again22:46
clarkbshall I reboot then?22:46
fungigo for it22:46
corvusi'm even able to load tc governance documents via docs.o.o, which are almost certainly not in any cache.  ;)22:47
fungii'm continuing to browse around to random pages on the docs site22:47
clarkband reboot succeeded as expected22:47
clarkbI'm going to put afs01.dfw.o.o and afsdb01.o.o back into puppet22:48
corvussorry i missed the opportunity to help debug earlier22:48
*** hwoarang has quit IRC22:48
clarkbthen release my mirror-update locks and we can merge https://review.openstack.org/#/c/642563/ if anyone else wants to be second +2 on that22:48
corvusmaybe shout "infra dash root" if you need more eyes next time :)22:48
fungiit was definitely a strange situation22:49
*** hwoarang has joined #openstack-infra22:49
fungijohnsom: stuff should be back to normal as of ~22:40z22:50
fungiso roughly 10 minutes ago22:50
mordredcorvus: I agree with your symptom from earlier of only sometimes getting the opendev logo on opendev.org22:50
*** threestrands has joined #openstack-infra22:50
johnsomYeah, works for me now22:51
mordredcorvus: I have no idea if it's service worker related - or whatnot - it's definitely 'interesting'22:51
clarkbI have released all of the locks on mirror-update after vos release for docs et al ran on afsdb0122:52
clarkblast remaining step is to merge https://review.openstack.org/#/c/642563/1 if infra-root can review that22:52
fungii think it got double-approved22:53
corvusmordred: i deleted my localhost:3000 service worker and it fixed my local gitea22:54
corvusi don't consider that to be a viable production fix, so i have not yet done that for opendev22:54
corvusbut i do think that points very strongly in the direction of service workers22:54
mordredyeah22:54
fungiis gitea storing different things in service workers than they're designed to be used for?22:55
fungiseems really strange for that to affect logs/branding22:56
fungier, logos/branding22:56
mordredcould have something to do with how those things are being bundled?22:56
*** tkajinam has joined #openstack-infra22:56
clarkblooks like we do all our mirror update crons on even numbered hours so 0000UTC is when they will next run22:58
clarkbI'll try to keep an eye on that in an hour22:58
clarkbbut as far as I can tell the afs fileservers are upgraded so I've moved them to the done section \o/22:59
clarkbif anyone wants to sort out the afs db server process I'm happy to help. I tried to dig into the process for that and couldn't really come up with anything good that didn't involve a proper outage22:59
clarkbI'll probably start collecting info on an etherpad for that tomorrow so we can get started on something23:00
*** TheJulia has joined #openstack-infra23:00
clarkb#status log Upgraded afs01.dfw, afs02.dfw, and afs01.ord to Xenial from Trusty23:01
openstackstatusclarkb: finished logging23:01
*** eernst has joined #openstack-infra23:05
*** dustinc has joined #openstack-infra23:07
*** eernst has quit IRC23:10
*** rascasoft has quit IRC23:12
openstackgerritMerged openstack-infra/project-config master: Revert "Disable wheel mirror updates for afs server upgrades"  https://review.openstack.org/64256323:14
*** mattw4 has quit IRC23:16
clarkbcorvus: one thing I notice on afs01.dfw that isn't the case on afs02.dfw or afs01.ord is that /etc/openafs/ThisCell is a symlink. We seem to try to set it with puppet and get Mar 11 23:16:39 afs01 puppet-user[3501]: (/Stage[main]/Openafs::Client/File[/etc/openafs/ThisCell]) Ensure set to :present but file type is link so no content will be synced23:27
clarkbcorvus: to make that message go away can I safely copy the target of that symlink over the symlink?23:28
clarkbthis isn't a regression due to the upgrade so not urgent23:28
corvusclarkb: i can't imagine that would be a problem, and it's not ringing a bell, so if we did that manually at some point (as opposed to some package install script somewhere) i can't recall.23:29
ianwclarkb: it being a symlink sounds maybe like something debconf would do in the interactive install case?23:31
clarkbianw fungi fyi I abandoned https://review.openstack.org/#/c/641880/23:32
clarkbianw: maybe? it's a symlink to the server/ThisCell file23:32
clarkbthe content between those files is the same on the other servers just not a symlink23:32
ianwzbr: tried to summarise what i understand to be the issue and suggested maybe abstraction of the name would be clearer.  very interested what fungi thinks23:33
fungiianw: where was this? on the etherpad?23:34
ianwfungi: https://review.openstack.org/#/c/639951/23:35
fungiahh, thanks23:36
*** threestrands_ has joined #openstack-infra23:42
*** threestrands has quit IRC23:45
corvusfungi: i responded on https://review.openstack.org/642574 and +W.  is that okay?23:45
fungiyep, fine by m,e23:46
fungier, by me23:46
corvusmy plan was to re-key the intermediate registry today after that merged23:47
corvusmy plan is now to re-key the intermediate registry tomorrow23:47
corvusthen we can consider that to be in production23:47
fungisounds great23:47
*** rascasoft has joined #openstack-infra23:52
openstackgerritMerged openstack-infra/zuul-jobs master: Add no_log entries to skopeo copy commands  https://review.openstack.org/64257423:57
