Friday, 2017-12-15

00:00 *** jkilpatr has joined #openstack-sprint
00:01 <pabelanger> okay, fixed
00:01 <pabelanger> well, manually fixed
00:01 <clarkb> jeblair: once 528101 is merged I am going to clean up the testing of my change on es05-07 and abandon that change
00:01 <pabelanger> we need to update our vhost
00:01 <pabelanger> let me get up a patch
00:04 <pabelanger> remote:   https://review.openstack.org/528132 Add javascript alias to cacti.o.o for xenial
00:04 <pabelanger> jeblair: clarkb: ianw: fix for cacti02.o.o ^ missing alias for javascript folder
00:05 <pabelanger> jeblair: thanks for pointer, would have been waiting a while for it to update
00:05 <pabelanger> :)
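
A quick way to confirm the alias added in 528132 is actually serving would be a request against the aliased path; a minimal sketch, where the /javascript/ mount point is an assumption based on the change subject:

    # Check that the new vhost alias answers; the aliased path is an
    # assumption, not confirmed by the log.
    curl -sI http://cacti02.openstack.org/javascript/ | head -n1
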
00:09 <pabelanger> okay, stepping away for a bit, will check back to make sure firewall rules are happy
00:29 <pabelanger> cool, starting to see data show up on cacti02.o.o now
00:29 <pabelanger> I'll delete cacti01.o.o in the morning if nobody objects
00:29 <pabelanger> EOD
00:58 *** jkilpatr has quit IRC
00:59 <clarkb> ianw: can I get a second review on https://review.openstack.org/#/c/528101/ ? that will allow elasticsearch upgrades to proceed
01:01 <jeblair> pabelanger: oh wow that's a crazy new tree thing.  yep, data looking good.  thanks!
02:32 <ianw> arrghh slight stuff up in the ordering of /var/www in the puppet-hound module ... fixing and adding a test.
02:41 *** harlowja has joined #openstack-sprint
02:42 *** harlowja has quit IRC
02:43 *** harlowja has joined #openstack-sprint
03:15 *** harlowja has quit IRC
04:36 *** harlowja has joined #openstack-sprint
04:38 <ianw> clarkb: http://logs.openstack.org/30/528130/4/check/legacy-puppet-beaker-rspec-infra/58ed957/job-output.txt.gz#_2017-12-15_04_34_45_888316
04:39 <ianw> it's not looking great for etherpad ... looks like a bunch of version stuff to work through
04:45 *** harlowja has quit IRC
04:50 <ianw> https://review.openstack.org/528156 starting to bump everything, not sure how far i'll get with it
05:05 <clarkb> ianw: probably need to update etherpad version
05:06 <ianw> yeah, trying that, i've put a basic rspec test in there, so it's a start
05:09 <ianw> ok, codesearch01 is alive!
05:21 *** harlowja has joined #openstack-sprint
05:22 *** harlowja has quit IRC
05:24 *** harlowja has joined #openstack-sprint
05:29 *** harlowja has quit IRC
05:38 *** harlowja has joined #openstack-sprint
06:03 <ianw> frickler: after 17 revisions, i *think* ethercalc is ready to go
06:05 <ianw> i've updated the changes required in the etherpad.  if you want to shepherd them through, then launch the node, should be gtg
06:32 *** harlowja has quit IRC
07:35 <frickler> wow, so much backlog ...
07:39 <frickler> looks like es03+04 are still untouched, so I'm going to launch a new es04, verify the firewall patch there locally, merge it and watch everything go boom
07:42 <frickler> uhoh, seems we need to take care of cleaning up nodes first, not enough quota left to start even one additional small node ... Requested 61440, but already used 1094656 of 1152000 ram
07:43 <frickler> that's 4G missing
08:29 <frickler> so I found an instance of subunit-worker01.openstack.org in error state, 140991b2-b376-4990-aed5-a07ffeb94ea6, launched earlier than the currently running 540b860a-ed52-4307-99ca-9f51103ae3f2
08:30 <frickler> I'm going to remove the errored one, hoping that that will be enough of a quota cleanup
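
The cleanup described here maps onto standard OpenStack client calls; a sketch, with the UUID taken from the log and the exact flags a reasonable assumption:

    # List instances stuck in ERROR state that still count against quota
    openstack server list --status ERROR
    # Inspect the errored duplicate before removing it
    openstack server show 140991b2-b376-4990-aed5-a07ffeb94ea6
    # Delete it to release the RAM quota it holds
    openstack server delete 140991b2-b376-4990-aed5-a07ffeb94ea6
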
08:48 <frickler> that worked fine, confirmed fw patch, merging now
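
Confirming the firewall patch locally would look roughly like the following; a sketch assuming 528101 manages the elasticsearch ports, which the log does not spell out:

    # Dump the live ruleset and look for the elasticsearch ports;
    # 9200/9300 are an assumption about what the change opens.
    sudo iptables -S | grep -E '9200|9300'
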
08:56 <frickler> ianw: major issue with https://review.openstack.org/528156 seems to be a missing systemd service definition, will try to fix that once I'm done with elasticsearch
10:50 <frickler> o.k., so new es04 looks sane to me, starting with migration tasks now
11:38 *** jkilpatr has joined #openstack-sprint
11:40 <frickler> new es04 is active, syncing shards, bbiab
11:51 *** jkilpatr has quit IRC
11:52 *** jkilpatr has joined #openstack-sprint
12:07 *** jkilpatr has quit IRC
12:35 <frickler> cluster is green, removing old es04 now and launching new es03
13:21 <frickler> es03 now syncing. puppet runs on trusty nodes seem to take a long time, scanning all es data, won't investigate further as that will soon no longer affect us
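
Waiting for the cluster to go green before deleting an old node can be watched on the standard health endpoint; a sketch, with the hostname assumed from the node naming in the log:

    # Standard Elasticsearch cluster health check; wait for
    # "status" : "green" and zero initializing/relocating shards
    # before removing the old node.
    curl -s 'http://elasticsearch03.openstack.org:9200/_cluster/health?pretty'
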
13:29 *** clayton has quit IRC
14:21 *** baoli has joined #openstack-sprint
14:22 *** clayton has joined #openstack-sprint
14:29 <frickler> old es03 deleted, new es02 launched, waiting for clarkb to start the grand finale now ;)
14:43 <clarkb> frickler: to do es02 we just have to update the apache proxy config on logstash.openstack.org
14:43 <clarkb> right now we point it at es02 but can point it to any of the others
14:51 <clarkb> frickler: you were verifying that https://review.openstack.org/#/c/528101/ applied properly? I am assuming so as you approved the change
14:54 <frickler> clarkb: yes, I did. causes a different workflow for changing hosts now, needing a puppet run instead of just an iptables restart
14:55 <clarkb> frickler: but seems to be working ok?
14:56 <frickler> clarkb: yep, except for the long run time on the old nodes I noted earlier, but that should now be obsolete anyway
14:57 <frickler> for changing logstash, is that the "discover_node" entry?
14:58 <clarkb> frickler: yes in puppet-logstash/templates/kibana.vhost.erb we do something like RewriteRule ^/elasticsearch/((.*/)?_search)$ http://<%= @discover_nodes[0] %>/$1 [P]
14:59 <clarkb> frickler: so I think changing the first element of that list will be what we want
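
After repointing discover_nodes and running puppet, a quick smoke test through the proxy would confirm the rewrite still reaches a live backend; a sketch, with the query path inferred from the RewriteRule quoted above:

    # Exercise the proxied _search path from the kibana vhost; a JSON
    # response (rather than a proxy error) means the rewrite target
    # is reachable.
    curl -s 'http://logstash.openstack.org/elasticsearch/_search?size=1'
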
15:09 <pabelanger> morning
15:10 <clarkb> good morning
15:10 <jeblair> good morning!
15:11 *** harlowja has joined #openstack-sprint
15:13 *** harlowja has quit IRC
15:14 <pabelanger> I confirmed with ttx that design-summit-prep can be deleted
15:14 <pabelanger> I'll do that shortly
15:15 <frickler> clarkb: https://review.openstack.org/528305 Prepare for replacing elasticsearch02
15:17 <pabelanger> remote:   https://review.openstack.org/528306 Delete design-summit-prep node
15:17 <pabelanger> easy review for people
15:17 <pabelanger> I'll delete the server and dns now
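
The delete step amounts to the usual launch-node counterpart; a minimal sketch, where the exact server name as registered in nova is an assumption:

    # Remove the retired instance; the name is assumed from the log.
    openstack server delete design-summit-prep.openstack.org
    # The matching A/AAAA records then need removing in the DNS
    # provider's interface as a separate step.
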
15:35 <pabelanger> cacti02.o.o looks to be setup correctly, I'll delete cacti01.o.o now unless people object
15:36 <clarkb> no objection here
15:36 <clarkb> pabelanger: maybe you can be second review on https://review.openstack.org/#/c/528305/ so that frickler can finish up the elasticsearch cluster upgrades
15:37 <pabelanger> looking
15:37 <pabelanger> +3
15:38 <clarkb> general note to the channel, can you try and make sure the etherpad is up to date with the work you did around the sprint this week before you sign off for the weekend? I will use that to put together an email summary of what we did
15:38 <pabelanger> https://review.openstack.org/528133/1 and https://review.openstack.org/528135/ are also some easy reviews for clean up of system-config
15:42 <pabelanger> I'll start looking into static.o.o, but that will likely need some sort of announcement for an outage
15:42 <pabelanger> and we likely don't want to roll that out today on a friday
15:43 <clarkb> pabelanger: ya we'll want to look at scheduling the more difficult upgrades around feature freeze / release
15:44 <clarkb> (as I expect we won't be getting much done after this week simply due to holidays and all that and then it's into the last milestone and feature freeze and all that fun)
15:46 <pabelanger> agree
15:46 <pabelanger> let me see what else we could finish off today before looking at static.o.o
15:48 <pabelanger> I'm going to read up on how to migrate kerberos
15:48 <clarkb> pabelanger: see comment on 528135 (mostly just looking to see what others think)
15:49 <clarkb> pabelanger: I wrote docs on how to do no downtime kerberos reboots. I expect that will come into play a little
15:49 <pabelanger> clarkb: yah, I seem to remember we did that before with kerberos
15:51 <pabelanger> okay, so I think we maybe stand up kdc04 as a new standby. Join it to kdc01 and kdc02, confirm it works, then offline kdc02.
15:52 <pabelanger> run run-kprop.sh to make kdc04 primary, stand up kdc03 (xenial primary), join to kdc04, delete kdc01, then run-kprop.sh again so kdc03 is the new final primary
15:52 <pabelanger> clarkb: ^ seem right?
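
The propagation steps named in this plan (run-kprop.sh) wrap the standard MIT Kerberos tooling; a hedged outline of what one propagation amounts to, since the script's exact contents aren't shown in the log and the paths here follow the Debian/Ubuntu layout:

    # Dump the realm database on the current master...
    kdb5_util dump /var/lib/krb5kdc/slave_datatrans
    # ...and push it to the standby KDC (hostname from the plan above)
    kprop -f /var/lib/krb5kdc/slave_datatrans kdc04.openstack.org
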
15:53 <clarkb> pabelanger: it sounds good, but I have little to go on as far as previous experience to know if it is right :)
15:53 <clarkb> jeblair: ^ is probably the best person to get input on that
15:53 <pabelanger> same
15:56 <pabelanger> let me propose puppet patches first to stand up xenial but not join
15:56 <pabelanger> then we can finalize order
15:57 <clarkb> ok
16:11 <jeblair> clarkb, pabelanger: some dns changes may be necessary too?
16:12 <jeblair> huh, i think we may have forgotten to document those
16:15 <jeblair> i'll catalog those real quick
16:22 <pabelanger> okay, thanks
16:22 <frickler> clarkb: so I can run the usual puppet apply on logstash.o.o?
16:24 <jeblair> pabelanger, clarkb: remote:   https://review.openstack.org/528323 Add kerberos / afs dns info
16:25 <jeblair> pabelanger, clarkb: if you want to change hostnames, we'll need to update dns
16:26 <clarkb> frickler: there is a playbook you can run on puppetmaster that does it
16:26 <clarkb> frickler: sorry I am dealing with kids right now but will be back at computer soon
16:28 <frickler> clarkb: o.k., I'll be back in about an hour then, too
16:40 <pabelanger> remote:   https://review.openstack.org/528328 Add kdc04.o.o xenial node
16:43 <pabelanger> okay, as I understand, we shouldn't have an issue with multiple slave KDCs with our master
17:10 <jeblair> pabelanger: i think so
17:10 <clarkb> frickler: we use the remote_puppet_adhoc.yaml playbook. It runs by default using hosts * so you have to use ansible-playbook --limit some.fqdn.here to restrict it to just the host you want
17:10 <clarkb> frickler: the other thing to keep in mind is it does not update the puppet modules and other git repos
17:10 <jeblair> clarkb: doesn't kick.sh do that?
17:11 <clarkb> oh kick.sh may
17:11 * clarkb checks
17:11 <jeblair> that's what i usually use
17:11 <clarkb> no kick.sh is basically just the same thing as above so I don't think the git repos update
17:11 <jeblair> that's what i meant.
17:11 <jeblair> irc lag
17:11 <clarkb> gotcha ya kick.sh makes it simpler to use
17:11 <clarkb> because you just pass the hostname and it does the limit for you
17:12 <jeblair> yep
17:12 <clarkb> frickler: so system-config/tools/kick.sh some.fqdn.here is simpler
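
Putting the two options discussed above side by side; a sketch, with the playbook's path on puppetmaster an assumption:

    # Long form: run the adhoc puppet playbook limited to one host;
    # the playbook path is assumed, not given in the log
    ansible-playbook --limit logstash.openstack.org \
        /opt/system-config/production/playbooks/remote_puppet_adhoc.yaml
    # Short form: kick.sh builds the --limit invocation for you
    system-config/tools/kick.sh logstash.openstack.org
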
17:26 <pabelanger> clarkb: mind a review on https://review.openstack.org/528328/ and parent https://review.openstack.org/528319
17:26 <pabelanger> I believe that will allow us to bring a new slave online
17:31 <clarkb> pabelanger: what points afs at kdc*.openstack.org? Might be getting ahead of myself but not seeing that in the change to add the new kdc (or in system-config otherwise)
17:34 <pabelanger> clarkb: we'll need some followup DNS changes to bring kdc04.o.o online in dns, so AFS could see it
17:35 <clarkb> right but what tells afs to look at kdc04 too? is that just afs config?
17:35 <pabelanger> clarkb: that should be https://review.openstack.org/528323/
17:35 <pabelanger> clarkb: I believe it just resolves it via dns
17:35 <pabelanger> but I will confirm
17:35 <clarkb> oh kerberos uses srv records, perfect
17:36 <pabelanger> yah
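
The SRV-based discovery can be checked once the records from 528323 are published; a sketch, with the record names following the usual Kerberos conventions and the realm/zone layout assumed:

    # Standard Kerberos SRV lookups; the zone is an assumption
    dig +short _kerberos._udp.openstack.org SRV
    dig +short _kerberos._tcp.openstack.org SRV
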
17:37 <clarkb> pabelanger: jeblair looks like zuul.openstack.org is still up and running, is that so that it can redirect to zuulv3.openstack.org?
17:38 <pabelanger> yah, fungi suggested we might be able to just update DNS now and delete zuul.o.o
17:39 <pabelanger> then discuss moving zuulv3.o.o back to zuul.o.o in the future
17:48 *** baoli has quit IRC
18:01 <frickler> clarkb: o.k., so the config seems to have been applied in the meantime, I'd assume I could go on replacing es02 now
18:03 *** baoli has joined #openstack-sprint
18:03 <clarkb> frickler: go for it
18:19 <pabelanger> okay, trying kdc04 in ord, same location as kdc02
18:28 <frickler> o.k., new es02 running, waiting on 2 shards. logstash.o.o seems to be doing fine
18:28 <clarkb> frickler: woot
18:36 <clarkb> frickler: if you need to run I can remove the old es02 when the cluster goes green
18:36 <clarkb> frickler: you don't need to hold off on your weekend for that :) thank you for all the help this week!
18:42 *** baoli has quit IRC
18:46 *** baoli has joined #openstack-sprint
19:05 <frickler> clarkb: thx, deleted old server now. have a nice weekend, everyone (though I'll probably be back tomorrow anyway ;)
19:16 <pabelanger> looks like some systemd issues with kerberos, looking now
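
First-pass debugging for systemd trouble like this usually starts with the unit status and journal; a sketch, assuming the Ubuntu package's krb5-kdc unit name, which the log does not confirm:

    # Show why the unit failed and its recent log output; the unit
    # name is an assumption about the package in use.
    sudo systemctl status krb5-kdc.service
    sudo journalctl -u krb5-kdc.service --since today
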
19:29 <fungi> yeah, my primary concern with deleting the old zuul.o.o instance is if anyone has anything in their homedirs/shell histories they want to grab first
19:30 <fungi> i doubt we still care about the logs on it at this point (odds are they've been rotated into oblivion by now anyway)
20:21 *** jkilpatr has joined #openstack-sprint
21:27 *** baoli has quit IRC
22:20 <clarkb> I shouldn't have anything in my homedir on any of our hosts that I care about
22:31 *** dteselkin has quit IRC
22:39 *** dteselkin has joined #openstack-sprint
22:39 <clarkb> I've cleaned out the infra netfilter persistent unit on the 3 elasticsearch nodes that got it. Going to abandon that change then work on an email summarizing what we did
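
Cleaning a leftover unit off a node generally looks like the following; a sketch, since the exact unit file name isn't given in the log and the one used here is a placeholder:

    # Stop and disable the stray unit, remove its file, and reload
    # systemd; the unit name is a hypothetical placeholder.
    sudo systemctl disable --now netfilter-persistent-infra.service
    sudo rm /etc/systemd/system/netfilter-persistent-infra.service
    sudo systemctl daemon-reload
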
23:21 <fungi> thanks for summarizing! i'll fix up the irc topics now
23:22 *** ChanServ changes topic to "OpenStack Virtual Sprints, schedule at https://wiki.openstack.org/wiki/VirtualSprints | Channel logs at: http://eavesdrop.openstack.org/irclogs/%23openstack-sprint/"
23:37 *** rwsu has quit IRC
