openstackgerritMike Burns proposed openstack/tripleo-heat-templates: change default compute hostnames to compute  https://review.openstack.org/30434200:03
openstackgerritDan Radez proposed openstack/tripleo-heat-templates: Enable deployment of Ceph Storage (OSD) on the Compute Nodes  https://review.openstack.org/27375400:11
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: change default compute hostnames to compute  https://review.openstack.org/30434208:34
shardymarios: Hey, wanted to chat about https://review.openstack.org/#/c/304342/108:37
shardySo, it seems we have a downstream patch which names the nodes compute, not novacompute08:37
shardybut if we switch that now, anyone upgrading between the upstream versions will be broken, right?08:37
shardybecause it's always been novacompute upstream08:38
*** coolsvap has joined #tripleo08:44
ccamachojaosorior, good morning :), speaking about broken gates... I'm trying to debug the Jenkins tests for a tripleo.sh patch.. And I'm not able to get the error... I mean... It's not working fine, right?08:45
ccamachoi.e the upgrades, ha, non ha jobs..08:46
jaosoriorccamacho: Yep, it's all red08:46
jaosoriorseems to me that it's broken08:46
jaosoriortrying to dig out what's up08:46
ccamachoI would like to debug it too... but not sure where to start08:46
ccamachofor example.. http://fpaste.org/354547/46045084/08:47
jaosoriorfor instance08:47
jaosoriormost of the failures I've been looking at in the past minutes are pretty similar08:47
jaosoriorin the sense that08:47
jaosorior2016-04-12 02:31:52.621 | | Controller             | 77046b52-c8d1-4540-9b46-ba318171d744 | CREATE aborted         | CREATE_FAILED      | 2016-04-12T02:31:32 |08:47
jaosoriorand then the overcloud ends up timing out because of that08:48
*** gfidente has joined #tripleo08:48
*** gfidente has quit IRC08:48
*** gfidente has joined #tripleo08:48
mariosshardy: o/ reading08:48
mariosshardy: yes, this is a good point. this 'workaround' was originally going to be delivered with an environment file during upgrades08:49
*** paramite is now known as paramite|afk08:49
shardymarios: Yeah having ComputeHostnameFormat obviously would work around it08:50
mariosshardy: well, i think the point is to get this into liberty08:50
mariosshardy: because it's an upgrades thing08:50
shardymarios: Yeah, if we get it into liberty and mitaka then it removes the problem for new deployments08:50
mariosshardy: but it still isn't the right fix, because ultimately you will have to continue to set that always and forever amen08:50
shardybut anyone who already deployed them cannot update08:50
jaosoriorccamacho: For instance, I see a bunch of errors in the ironic log like this: Client-side error: Node 8cef9838-3298-4203-a133-de298c4b68d5 is locked by host localhost.localdomain, please retry after the current operation is completed.08:50
shardyI know given the preliminary state of upgrades upstream we've not made huge commitments re upgrades (yet), but I'm worried about RDO fallout08:51
*** paramite|afk is now known as paramite08:51
shardyso we probably need to at least get trown's input on it when he wakes up08:51
shardyalso if we do backport it, we need to test the impact of changing this on an existing deployment via update08:51
shardye.g it clearly breaks upgrades, but will we break all existing users if they do another overcloud deploy to e.g apply a config change?08:52
ccamachojaosorior, I will try to find something locked.. I will let you know if I manage to find it08:52
mariosshardy: ack. I still need to test the explicit setting workaround ... hopefully we can understand more about what cloud-init is doing and if we're using it wrong or it needs a fix.08:52
ccamachojaosorior, using your clues BTW :)08:52
shardymarios: Yeah the name can be updated, but cloud-init won't run again:08:53
mariosshardy: thanks for the ping, these concerns make sense wrt upgrades (I thought we'll get it into liberty so np, but if you've already got an 'overcloud-novacompute' then upgrading to 'overcloud-compute' is the same problem in reverse to https://bugzilla.redhat.com/show_bug.cgi?id=1324739)08:54
openstackbugzilla.redhat.com bug 1324739 in openstack-tripleo-heat-templates "Duplicate nova hypervisors show up in nova hypervisor-list post 7.3 -> 8 upgrade" [Urgent,On_dev] - Assigned to mandreou08:54
shardymarios: Yeah, that's my worry - and we do have RDO users of TripleO now to consider08:55
mariosshardy: so do you think we first need to do a stack update to rename our compute nodes08:55
mariosshardy: and then we can run the upgrade08:55
shardymarios: I don't think a stack update will rename them08:56
shardybut yeah we at least need to test that08:56
mariosshardy: oh i thought that's what you meant - ok08:56
shardymarios: the stack update will *try* to update them, but I think we'll end up in an inconsistent, possibly broken state08:56
shardybecause the OS::Nova::Server "name" will get updated, but then nothing will make cloud-init adjust the node hostname08:57
mariosshardy: well, that is essentially what is happening here no? it is a stack update with the new templates, which have the compute hostname change08:57
mariosshardy: and as you say, even though nova knows the new node as 'overcloud-novacompute' and it is set as such in /etc/hosts on all nodes08:57
mariosshardy: cloud-init keeps setting /etc/hostname to 'overcloud-compute-0'. I had to explicitly disable cloud-init to get the 'new' name to stick on a reboot08:58
shardymarios: Yeah, it's picking up the old nova user/metadata I guess09:00
shardyI didn't realize it runs every boot tho, I thought preserve_hostname was the default09:00
mariosshardy: we don't have that set, i had to set it09:00
shardymarios: ack, hmm, tricky - so whatever we do, changing the default is a problem unless we explicitly don't support folks with existing deployments from the upstream branches09:03
shardyOr we figure out a way to collect the new hostname from nova and make it stick09:03
mariosshardy: yeah that's the real fix09:03
*** julim has quit IRC09:04
*** bvandenh has joined #tripleo09:04
*** julim has joined #tripleo09:05
gfidenteso there is a bug with cloud-init here too, where it runs at every boot09:06
gfidentewhen it shouldn't09:06
gfidenteI think marios pointed that out in the BZ09:06
jistrbandini: discovered we have openstack services still running after `pcs resource disable openstack-core` -- openstack-ceilometer-notification-clone, openstack-sahara-engine-clone, openstack-aodh-listener-clone -- should we hook those to openstack-core too?09:08
jistrspecifically not having openstack-ceilometer-notification-clone stopped seems to cause crm_resource --wait to hang forever when stopping openstack-core09:08
bandinijistr: could you send me a CIB of this system? Is this master (aka newton or mitaka)?09:09
shardygfidente: Yeah, and despite running every boot it always uses the old locally stored nova metadata09:09
jistryes, it's master09:09
* bandini fighting with failed deployments :/09:10
shardygfidente, marios: If we can do preserve_hostname I wonder if we can have another script which pulls the nova metadata (I think the hostname is reflected in the ec2 metadata?) and updates the hostname on boot if it changes09:10
shardyyou still have to reboot the computes then tho09:10
jistrbandini: CIB http://chunk.io/f/64955f08cc644639bb8e2bb9fbc9549c09:11
mariosshardy: yeah like a firstboot. we can even set/deliver/create that during the 'upgrades init' step - we already have a softwareconfig there that delivers the upgrade script to computes09:12
shardymarios: ah, could work then09:12
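The boot-time script shardy describes could look roughly like the following. This is only a minimal sketch under several assumptions: that the EC2-compatible metadata service is reachable at the usual link-local address, that it really does reflect the renamed server (the log does not confirm this), that preserve_hostname is set so cloud-init does not overwrite the result, and that comparing short host names is the right test. The function names are hypothetical.

```python
import socket
import subprocess
import urllib.request

# Assumption: EC2-compatible metadata endpoint at the usual link-local address.
METADATA_HOSTNAME_URL = "http://169.254.169.254/latest/meta-data/hostname"


def short_name(fqdn: str) -> str:
    """Return the host part of a possibly fully qualified name, lowercased."""
    return fqdn.strip().split(".", 1)[0].lower()


def needs_rename(current: str, advertised: str) -> bool:
    """True when the metadata name is non-empty and differs from the local one."""
    wanted = short_name(advertised)
    return bool(wanted) and wanted != short_name(current)


def fetch_metadata_hostname(url: str = METADATA_HOSTNAME_URL,
                            timeout: float = 5.0) -> str:
    """Read the hostname the metadata service advertises for this instance."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def sync_hostname() -> bool:
    """Rename the node if the metadata name changed; return True if it did."""
    current = socket.gethostname()
    advertised = fetch_metadata_hostname()
    if not needs_rename(current, advertised):
        return False
    # hostnamectl persists the new static hostname on systemd-based hosts.
    subprocess.run(
        ["hostnamectl", "set-hostname", short_name(advertised)], check=True
    )
    return True
```

Calling sync_hostname() from a unit that runs early in boot would apply the rename; as shardy notes, the computes would still need a reboot for services to pick up the new name.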
* shardy looks to see if the hostname is updated by nova in the metadata09:13
mariosjistr: gfidente: o/09:13
mariosbandini: o/09:13
jaosoriorccamacho: There also seems to have been a problem with swift09:13
marioso/ jaosorior ccamacho09:13
* marios coffeee09:13
jistrbandini, rasca: the crm_resource --wait i discussed with rasca yesterday on rhos-pidone is caused by pending actions to start heat-engine and heat-api-cloudwatch after disabling openstack-core. I don't know why that is happening, i think it's a bug in pacemaker constraint resolution. But if i hook ceilometer-notification after openstack-core, so that it gets stopped too, the pending start actions for those heat services don't appear.09:13
jaosoriormarios: Hey dude09:14
jistrbandini, rasca: funny thing is that heat-api also depends on ceilometer-notification and openstack-core, but it doesn't have a pending start action -- that's why i think it's a bug in constraint propagation/resolution09:14
jaosoriorccamacho: I see09:14
jaosoriorApr 12 08:20:59 instack.localdomain glance-api[17675]: 2016-04-12 05:20:59.325 17747 ERROR swiftclient Traceback (most recent call last):09:14
*** akrivoka has joined #tripleo09:14
jaosoriorApr 12 08:20:59 instack.localdomain glance-api[17675]: 2016-04-12 05:20:59.325 17747 ERROR swiftclient   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1565, in _retry09:14
jaosoriorApr 12 08:20:59 instack.localdomain glance-api[17675]: 2016-04-12 05:20:59.325 17747 ERROR swiftclient     service_token=self.service_token, **kwargs)09:14
jaosoriorApr 12 08:20:59 instack.localdomain glance-api[17675]: 2016-04-12 05:20:59.325 17747 ERROR swiftclient   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 929, in head_container09:14
jaosoriorApr 12 08:20:59 instack.localdomain glance-api[17675]: 2016-04-12 05:20:59.325 17747 ERROR swiftclient     http_response_content=body)09:14
jaosoriorApr 12 08:20:59 instack.localdomain glance-api[17675]: 2016-04-12 05:20:59.325 17747 ERROR swiftclient ClientException: Container HEAD failed:
jaosoriorApr 12 08:20:59 instack.localdomain glance-api[17675]: 2016-04-12 05:20:59.325 17747 ERROR swiftclient09:15
jaosoriorand after that a bunch of glance errors09:15
jistrmarios: o/09:15
bandinimarios: o/09:15
jaosoriorI guess cause it tries to fetch that image09:15
ccamachojaosorior, I see https://s-media-cache-ak0.pinimg.com/736x/f0/56/46/f056465cb8a29fab9e796f73a543d6a0.jpg09:16
jaosoriorI should brew some coffee too09:16
jaosoriornow that I think about it09:16
rascajistr, ok so the point is: given that we have a bug but the developers themselves do not suggest relying on crm_resource, what is the best way to achieve the result we want?09:17
*** openstackgerrit has quit IRC09:17
bandinijistr: so openstack-ceilometer-notification-clone has never been dependent on keystone before, while openstack-sahara-engine-clone and openstack-aodh-listener-clone are new09:19
rascabandini, openstack-ceilometer-notification-clone WAS dependent on keystone, as you can see from the schema in our gdoc09:21
jistrrasca: i think we were suggested to use crm_resource about 3/4 months ago, no? if the suggestion now changed to the opposite, i'm fine with that but i'd like to at least understand what's the issue. I think we have either broken constraints, or a bug in pacemaker, or both. I don't like that we're trying to be so quick to dismiss a solution that worked perfectly until now, without first properly investigating what is the problem in the first09:21
derekh
rascajistr, I totally agree, even if the fact that it worked until now I don't think must be the only thing to consider while choosing to change something09:23
bandinirasca: do you have a link handy? that is not what I see from my docs09:23
bandinijistr: agreed we first need to get to the bottom of this issue fully09:24
rascabandini, https://docs.google.com/document/d/1aXQ07CNazxt6xWbegIfYlP4VyZL3Dc9ZIkI8bRiB6rc/ if you look at the first schema09:24
*** coolsvap has joined #tripleo09:24
jaosoriorderekh: Been trying to look into it09:25
jistrrasca: yeah agreed, my point wasn't just about "worked until now", it was more about "worked until now and we still have no idea why it stopped working"09:25
bandinirasca: it is not what is deployed by tripleo09:25
jaosoriorderekh: But haven't figured out much... all I know is that ironic gets some error related to it not being able to do an operation because a node is locked. Then swift not being able to write an image (for some reason) and because of that, glance getting a bunch of resource-not-found's09:26
derekhjaosorior: I've just lost access to the cloud (in the last 10 minutes), trying to get back on now09:26
jaosorioroh, even that is down now?09:26
rascabandini, in osp8 there is start openstack-ceilometer-alarm-notifier-clone then start openstack-ceilometer-notification-clone09:26
rascabandini, so IT IS dependent09:27
rascabandini, maybe things changed while inserting openstack-core09:27
derekhjaosorior: yup, although if you want to keep looking into whatever the problem was before this, fire ahead09:27
derekhjaosorior: it might still be a problem when the cloud is back up09:27
rascabandini, but that constraint makes a dependency like the one I have in my docs09:27
bandinirasca: no likely that was before mitaka (http://acksyn.org/files/tripleo/wsgi-2016-02-24-cib.pdf is *pre* openstack-core)09:28
jistrbandini, rasca: re constraints on those openstack services -- i think the issue is that they're not really hooked to any dependencies whatsoever. E.g. i'd imagine they rely at least on RabbitMQ, but they're not hooked to that either. So if we stop Rabbit, ceilometer-notification will still keep running.09:28
bandinijistr: yes we need to tweak them for the full-pacemaker architecture for sure09:29
* bandini mumbles something about lightweight arch09:29
gfidentebandini what do we do with cinder-volume in lightweight?09:31
jistryeah i thought for full arch we aimed for "backend services -> openstack-core -> openstack services" type of dep, so it would seem logical to me to hook those 3 services after openstack-core09:31
bandinigfidente: that stays A/P (http://acksyn.org/files/tripleo/light-cib.pdf - note that I removed mongod as well from the last templates)09:33
gfidentebandini jistr that conversation about migrations09:34
bandinijistr: agreed (http://acksyn.org/files/tripleo/newton-jistr-2016-04-12.pdf  -> we need to hook up those three services)09:34
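To illustrate why the unhooked services keep running, here is a small sketch of the "backend services -> openstack-core -> openstack services" ordering jistr describes. The resource names come from the discussion, but the edges are illustrative assumptions and are not taken from a real CIB; the helper names are hypothetical.

```python
from collections import deque

# "X: [Y]" means X must be running before Y starts, so stopping X stops Y.
# Assumption: a simplified graph, not the constraints of any real deployment.
STARTS_BEFORE = {
    "rabbitmq-clone": ["openstack-core-clone"],
    "openstack-core-clone": ["openstack-heat-engine-clone"],
    "openstack-heat-engine-clone": [],
    # Not hooked to anything, as described in the discussion.
    "openstack-ceilometer-notification-clone": [],
}


def stopped_by_disabling(resource, graph):
    """Return every resource that stops when `resource` is disabled."""
    seen = {resource}
    queue = deque([resource])
    while queue:
        for dependent in graph.get(queue.popleft(), []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def left_running(resource, graph):
    """Return the resources untouched by disabling `resource`."""
    everything = set(graph)
    for dependents in graph.values():
        everything.update(dependents)
    return everything - stopped_by_disabling(resource, graph)
```

With this graph, disabling openstack-core-clone (or even rabbitmq-clone) leaves openstack-ceilometer-notification-clone running; adding it to the dependents of openstack-core-clone makes it stop with the rest.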
bandinigfidente: aye we need to get closure there too :/09:34
gfidenteI wonder if it isn't easy to just remove all the services/constraints and let puppet run with the updated manifest where it starts them as needed09:34
jistrhmm the question is when and where you let puppet run09:35
gfidenteafter we removed the constraints09:35
gfidentebut this is to migrate to lightweight09:36
jistrand what about the other effects of puppet, like the example i keep bringing up -- from 7 to 8 a rabbitmq passwd change is very likely to happen09:36
jistrand if you run puppet only on controllers, it will break rabbitmq comms with other nodes09:36
jistrand if you run puppet everywhere, then you're running new templates on un-upgraded nodes09:37
jistrthe problem is that puppet is trying to do the full converge of the whole cloud, while we cannot let that happen just yet when some nodes haven't been upgraded yet, IIUC09:38
gfidenteI want to let puppet do that indeed09:38
jistrso we could do that migration with puppet, but not with t-h-t manifest. It would have to be specifically crafted mini-manifest just for the migration, to avoid applying unwanted changes alongside the migration.09:38
gfidenteI actually liked the yum_update approach09:39
gfidentego in maintenance, remove constraints or services as needed09:39
gfidenterun yum update09:39
gfidenterun puppet and let it bring back in known state09:39
jistryeah, but IIRC there puppet ran also only after *all* nodes have finished the minor update09:40
jistrif we do it with t-h-t templates then it could be option 4 here https://etherpad.openstack.org/p/tripleo-migrations09:42
*** mgould has joined #tripleo09:42
gfidentejistr well 3 to me09:44
jistrgfidente: or if you are suggesting we do the migration in a minor update instead of major update, then it's option 1 in that etherpad09:44
jistrgfidente: well that doesn't work very well AFAICT, see line 67 in the etherpad09:44
gfidenteyou mean it doesn't work very well because the rpms on the computes could be from the previous release when trying to run puppet?09:46
aparnavHey, Can someone take a look at this patch https://review.openstack.org/#/c/295203/ ?09:46
jistrgfidente: yea. The new puppet modules and t-h-t might contain things specific to the new release.09:47
*** apetrich has joined #tripleo09:47
jistri don't think we can guarantee that we can safely run mitaka tht&puppet on top of liberty RPMs09:47
jistrand also, given our current distribution model of puppet modules09:48
gfidentebut we can update rpms on all nodes though09:48
gfidentelike yum_update did09:48
jistrin theory we could but we got an explicit requirement not to tamper with all computes at once09:49
jistri didn't ask why, but i assume the reason were concerns that if something goes wrong during a mass-upgrade of computes, we might in theory bring down the whole cloud incl. all running workloads09:51
*** shardy has quit IRC09:52
jistrgfidente: btw even yum_update didn't just remove things and wait for puppet to add the new ones. yum_update also added the new constraints too because we had to keep a good cluster state between yum_update.sh and puppet too09:53
*** shardy has joined #tripleo10:02
*** jcoufal has quit IRC10:03
*** miles has joined #tripleo10:03
*** miles is now known as mgould_10:04
*** mgould has quit IRC10:04
*** bvandenh has quit IRC10:05
openstackgerritMerged openstack/tripleo-quickstart: Add libselinux-python to install_deps  https://review.openstack.org/30416710:06
*** ccamacho is now known as ccamacho|lunch10:21
*** jaosorior has left #tripleo10:21
*** jaosorior has quit IRC10:21
*** jaosorior has joined #tripleo10:22
openstackgerritImre Farkas proposed openstack/tripleo-docs: Document ready-state configuration  https://review.openstack.org/29948110:29
openstackgerritMerged openstack/tripleo-ui: Update license string to use SPDX Identifier  https://review.openstack.org/30184910:30
*** ccamacho|lunch is now known as ccamacho10:38
openstackgerritJiri Tomasek proposed openstack/tripleo-ui: Move Environment and Parameters config to single modal  https://review.openstack.org/30227210:45
openstackgerritJiri Tomasek proposed openstack/tripleo-ui: Move Validations to right sidebar  https://review.openstack.org/30412710:45
openstackgerritJiri Tomasek proposed openstack/tripleo-ui: Deployment Plan page updates  https://review.openstack.org/30395810:45
*** lblanchard has joined #tripleo11:03
*** jcoufal has joined #tripleo11:07
*** ramishra has joined #tripleo11:11
*** ramishra_ has joined #tripleo11:14
*** andrearosa has joined #tripleo11:18
*** ramishra_ has quit IRC11:22
*** ramishra_ has joined #tripleo11:22
*** ramishra_ has quit IRC11:23
openstackgerritwes hayutin proposed openstack/tripleo-quickstart: add ipv4 network-isolation to quickstart for virt deployments  https://review.openstack.org/30303011:25
*** mgould_ has quit IRC11:25
*** ramishra_ has joined #tripleo11:27
openstackgerritJames Slagle proposed openstack/tripleo-heat-templates: Add missing ManagementIpSubnet  https://review.openstack.org/30126611:32
openstackgerritJames Slagle proposed openstack/tripleo-heat-templates: Add net-config-static.yaml  https://review.openstack.org/30126711:32
jaosoriorslagle: CI is broken :/11:32
openstackgerritJames Slagle proposed openstack/os-net-config: Add support for OVS tunnels  https://review.openstack.org/30421511:33
jaosoriorso, no use rechecking ATM11:33
*** mcornea_ has joined #tripleo11:33
openstackgerritJames Slagle proposed openstack/os-net-config: Fix typos  https://review.openstack.org/30421911:34
slaglejaosorior: yea, i know that11:34
slaglewas rebasing11:34
jaosoriordidn't see your nick in this channel, that's why I repeated on the other one11:35
*** panda has joined #tripleo11:42
*** MaxPC has joined #tripleo11:45
derekhslagle: bnemec rh1 down, brought it back up, instances now not getting ipaddresses11:49
bandinidtantsur: ever seen this? http://fpaste.org/354619/60462136/ I get this during introspection on a mitaka BM env. I am moderately sure it worked like a month ago (aka last time I tried)11:56
bandiniI will look into it more, but maybe it rings a bell11:57
dtantsurbandini, looks like https://bugzilla.redhat.com/show_bug.cgi?id=132289211:58
bandinidtantsur: yep that's the one, let me look at it more in detail after some coffee ;)12:04
dtantsurbandini, tl;dr: there is a fix in newton, but it can't be directly backported for mitaka12:04
dtantsuror do we have like builds of images from master?12:06
trownbandini: ya I am contemplating just building IPA from master for all RDO releases until there is some issue that breaks12:06
EmilienMtrown: details? anything related to puppet? we merged lot of stuff monday12:06
trowndtantsur: not yet, working on that today12:06
EmilienM(in puppet modules)12:06
trownEmilienM: check topic, CI cloud is down12:06
dtantsurtrown, great! then we can allow people to choose even: use IPA stable or IPA master12:06
dtantsuryour tripleo-quickstart could have an option for that12:06
*** mgould has joined #tripleo12:07
trowndtantsur: suppose so, the option would have to download the non-default image, but it is doable12:07
sshnaidmtrown, hi12:11
trownsshnaidm:  hi :)12:11
sshnaidmtrown, do you know ways to connect overcloud directly without creating 3 tunnels? I mean auth_url or even vms on it12:12
*** ramishra_ has quit IRC12:12
trownsshnaidm: larsks wrote some docs for tripleo-quickstart on that https://github.com/openstack/tripleo-quickstart/blob/master/docs/accessing-overcloud.md12:12
trownthey should be mostly relevant to other environments12:13
*** lblanchard has joined #tripleo12:13
sshnaidmtrown, great, thanks12:13
*** ramishra_ has joined #tripleo12:14
*** ramishra_ has quit IRC12:14
openstackgerritCarlos Camacho proposed openstack-infra/tripleo-ci: Removing previously created resourses in pingtest  https://review.openstack.org/30456012:17
*** lblanchard has quit IRC12:18
*** ramishra_ has joined #tripleo12:19
derekhdprince: slagle: bnemec rh1 down, brought it back up, instances now not getting ipaddresses, if anybody has any idea jump in and poke around12:20
dprincederekh: ack, will look in a bit12:21
jaosoriorderekh: I have no clue honestly. But does this have anything to do? https://review.openstack.org/#/c/303850/12:21
derekhjaosorior: it shouldn't, that's the config of how nodepool talks to our cloud, at the moment we're not even getting that far12:23
jaosorioroh crap, alright :/12:23
openstackgerritDavid Sariel proposed openstack/tripleo-heat-templates: Enable cinder-backup service start  https://review.openstack.org/30456312:24
*** aufi has quit IRC12:25
*** coolsvap is now known as coolsvap|away12:25
trownshardy: mind putting a PTL stamp of approval on https://review.openstack.org/#/c/304145/12:29
*** ramishra_ has quit IRC12:30
*** jpena is now known as jpena|lunch12:30
bandinitrown: can you cc me on the quickstart IPA change?12:32
openstackgerritCarlos Camacho proposed openstack-infra/tripleo-ci: Removing previously created resourses in pingtest  https://review.openstack.org/30456012:32
*** ramishra has joined #tripleo12:33
trownbandini: actually would you mind filing a wishlist bug for it? https://bugs.launchpad.net/tripleo-quickstart12:33
trownI need to sync over all the open issues from the redhat-openstack github still12:34
bandinitrown: ack sure12:34
*** ramishra_ has joined #tripleo12:35
shardytrown: done12:37
trownshardy: thanks!12:37
*** ramishra_ has quit IRC12:38
bandinitrown: https://bugs.launchpad.net/tripleo-quickstart/+bug/156932212:40
openstackLaunchpad bug 1569322 in tripleo-quickstart "RFE: support IPA ramdisk images from master" [Undecided,New]12:40
*** ramishra_ has joined #tripleo12:40
trownbandini: thanks!12:41
*** ramishra_ has quit IRC13:16
dprincederekh: yeah. for some reason I recall rebuilding them all13:17
dprincederekh: as being the fix for the ARP flood issue13:17
dprincederekh: just the computes13:17
derekhdprince: iirc, yup we did, the reboot seems to have stopped the flooding, but obviously something is still wrong13:18
derekhdprince: we can rebuild and see what happens13:18
dprincederekh: so you say you get IPs, but they don't work?13:18
derekhdprince: if the problem isn't obvious13:18
derekhdprince: the instances are booting but not getting an IP from dhcp13:19
*** ramishra_ has joined #tripleo13:20
*** aparnav has quit IRC13:21
*** tiswanso has joined #tripleo13:22
*** lblanchard has joined #tripleo13:25
*** ramishra_ has quit IRC13:25
dprincederekh: can I clean out all of the shutoff instances from nodepool?13:28
*** morazi has joined #tripleo13:28
derekhdprince: I got no issue with it, as long as nodepool will be ok with it13:29
dprincederekh: it will recover13:29
derekhdprince: ack13:29
dprincederekh: unlikely they would come back up anyways... especially if we resort to rebuilding the computes13:29
derekhdprince: yup13:29
openstackgerritRyan Hallisey proposed openstack/tripleo-heat-templates: Composable Keystone Containers  https://review.openstack.org/30428213:30
openstackgerritRyan Hallisey proposed openstack/tripleo-heat-templates: composable neutron dhcp service  https://review.openstack.org/30338613:30
openstackgerritRyan Hallisey proposed openstack/tripleo-heat-templates: composable neutron metadata service  https://review.openstack.org/30361813:30
openstackgerritRyan Hallisey proposed openstack/tripleo-heat-templates: composable neutron l3 service  https://review.openstack.org/30356213:30
openstackgerritRyan Hallisey proposed openstack/tripleo-heat-templates: composable glance services  https://review.openstack.org/23737013:30
openstackgerritRyan Hallisey proposed openstack/tripleo-heat-templates: Add GlanceRegistry to the endpoint map  https://review.openstack.org/30372813:30
*** trozet has joined #tripleo13:31
*** links has joined #tripleo13:36
derekhdprince: my latest attempt got an IP, I restarted nova-compute and neutron-ovs-agent on the compute, not sure if it made a difference13:42
derekhdprince: did you change anything ?13:42
derekhdprince: floating ip works also, gonna restart them on all compute nodes and see if it helps13:43
dprincederekh: I deleted all the nodepool instances13:43
dprincederekh: and then restarted the neutron DHCP and OVS processes on the controller13:43
dprincederekh: just a hunch13:43
derekhdprince: I've also restarted them a few times but maybe you tickled something I didn't13:44
derekhdprince: floating ip works also, gonna restart them on all compute nodes and see if it helps13:45
dprincederekh: cool13:45
dprincederekh: we might go on and clean out the floatingips' too13:45
dprincederekh: just to clean house... nodepool should fix itself13:46
*** dtrainor has joined #tripleo13:46
derekhdprince: yup13:46
*** akshai has joined #tripleo13:48
dprincederekh: looks like only 2 are assigned. I'll delete the rest of them13:49
dprincederekh: I'm slightly concerned one of the computes is still broken. and if something got scheduled on it it might hose us again13:49
*** ramishra_ has quit IRC13:49
dprincederekh: if that happens I suppose we'll find out soon enough13:50
derekhdprince: not sure what you mean?13:50
dprincederekh: just that the ARP flooding will resume once an instance gets spawned on one of (the broken) compute nodes13:50
*** ramishra_ has joined #tripleo13:50
derekhdprince: which broken compute node?13:51
*** sanjay__u has joined #tripleo13:51
dprincederekh: I don't know which one :)13:51
derekhdprince: ahhh, you think one might be broken13:52
dprincederekh: or how to check it even. Just that there might be one of them that caused this... and as soon as it gets an instance spawned on it we'd be back where we started13:52
dprincederekh: okay, floatingip's cleaned up13:52
dprincederekh: my instance was pinging fine13:52
openstackgerritCarlos Camacho proposed openstack-infra/tripleo-ci: Removing previously created resources in pingtest  https://review.openstack.org/30456013:53
dprincederekh: if connectivity is working for you too we might try opening up to nodepool again...13:53
derekhdprince: wait13:53
dprincederekh: I won't touch that. I was going to let you kick the tires...13:54
derekhdprince: rebooting the proxy, gearman and mirror servers to see if they get an IP this time13:54
dprincederekh: yep, gotcha13:54
derekhdprince: once they're confirmed and testenvs are connecting to gearman we can open back up to nodepool13:54
dprincederekh: I think they will recover13:55
dprinceonce you reboot them13:55
derekhdprince: mirror server up, had to start httpd13:56
*** ramishra_ has joined #tripleo13:56
derekhdprince: the other 2 not up yet13:57
shardyHey folks, the meeting starts in 2 mins in #openstack-meeting-alt13:58
slaglethere's a new sheriff in town14:00
derekhdprince: those 3 servers are back now, will open up the iptables rule in a couple of minutes once I make sure testenvs are registering ok14:05
*** ramishra_ has quit IRC14:06
dprincederekh: sounds good14:06
*** Ryjedo has joined #tripleo14:07
*** ramishra_ has joined #tripleo14:07
*** links has quit IRC14:08
*** ramishra_ has quit IRC14:11
*** ramishra_ has joined #tripleo14:12
*** ramishra_ has quit IRC14:15
openstackgerritCarlos Camacho proposed openstack-infra/tripleo-ci: Removing previously created resources in pingtest and --skip-pingtest-cleanup option  https://review.openstack.org/30456014:16
*** ramishra_ has joined #tripleo14:17
derekhdprince: the testenvs dont appear to be connecting to geard, gonna rebuild them now14:17
openstackgerritCarlos Camacho proposed openstack-infra/tripleo-ci: Removing previously created resources in pingtest and --skip-pingtest-cleanup option  https://review.openstack.org/30456014:17
openstackgerritFlorian Fuchs proposed openstack/tripleo-ui: Adds a progress status for the current deployment  https://review.openstack.org/30343614:18
dprincederekh: :/. maybe they lost a port in there somewhere?14:18
dprincederekh: might be worth cleaning up all the ports if we are rebuilding those too....14:19
dprincederekh: ports on the 192. network that is14:19
derekhdprince: perhaps, will be doing that too14:19
dprincederekh: cool14:19
*** liverpooler has joined #tripleo14:25
*** ramishra_ has quit IRC14:28
*** ramishra_ has joined #tripleo14:28
gfidentemichchap dprince hey I was following your steps with the puppet restructuring for ceph14:33
gfidenteI think I have the puppet split ready14:33
gfidentethough I figured I want to reuse ceph_osd profile for the ceph storage node14:34
gfidenteexcept it has understanding of 'steps' which we don't use on non-controller nodes14:34
gfidentehave you faced something like this before and have ideas?14:34
michchapgfidente: oh neat I was just talking to someone else on the opnfv team about doing the ceph profiles, since we have another change we wanted to get merged that would depend on it14:34
gfidenteoh I can submit those14:35
michchapthere's a bunch of options ranging from really hacky to mildly hacky14:35
openstackgerritGiulio Fidente proposed openstack/puppet-tripleo: Add ceph profiles  https://review.openstack.org/30467514:35
*** radez has joined #tripleo14:35
gfidenteI have the tht change ready too, except for the cephstorage nodes14:36
michchapgfidente: radez is driving it from the opnfv side14:36
gfidentebecause of the issue with step14:36
radezbeen working on the hyper converged stuff14:37
gfidenteradez ack, I'm getting there14:37
*** liverpooler has quit IRC14:37
radezbut dprince suggested not merging that since the composable stuff is coming14:37
gfidenteI'm working on the puppet manifests split first14:37
gfidenteyep I read that and commented14:37
radezah, gotcha, hadn't seen your comment yet14:38
michchapgfidente: as far as steps, the issue is the OSD fails when it starts before the MON right?14:38
gfidentewe do have control over that in heat14:38
michchapgfidente: and without step on non-controllers it's difficult to do that14:38
gfidentethe problem is the osd profile needs 'steps' to be successfully applied on a controller14:39
gfidentebut we don't have 'steps' on the cephstorage nodes14:39
michchapis there a flag that indicates that a node is a controller?14:39
michchapif (!hiera('is_controller') or $step >= N)14:40
*** rbrady has quit IRC14:41
*** rbrady has joined #tripleo14:41
*** rbrady has quit IRC14:41
gfidenteI was thinking about checking if step is undef14:42
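A minimal sketch of the gating idea being discussed, combining michchap's `is_controller` check with gfidente's "step is undef" check. It is written in Python rather than Puppet purely for illustration; the function name and the `min_step` parameter are hypothetical, and the real value of N is not given in the conversation.

```python
from typing import Optional


def should_apply_osd(step: Optional[int], min_step: int, is_controller: bool) -> bool:
    """Return True if the OSD profile should be applied on this run.

    Hypothetical helper: mirrors `!is_controller or step >= N`, plus the
    "step is undef" case for roles that do not pass a step at all.
    """
    if step is None:
        # Roles without steps run Puppet once, so apply unconditionally.
        return True
    if not is_controller:
        return True
    # On a controller, wait until the step at which the MON is already up.
    return step >= min_step
```

With this shape a cephstorage node (no step) always applies the profile, while a controller only applies it once the step reaches the threshold.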
gfidentebut I wanted dprince feedback here too on the steps story on non-controllers14:43
gfidentejistr marios ^^14:43
michchapis puppet only run a single time on non-controllers?14:43
gfidentecurrently yes14:44
dprincegfidente: we will use steps on all the roles14:44
jistrcurrently yes14:44
dprincegfidente: that is the way things truly become composable14:44
gfidentedprince right14:44
dprincegfidente: We are de-composing the controller first14:44
gfidenteand steps will be identical on all role-types then14:44
dprincegfidente: once that is done we can move towards making the other roles support adding in the functionality14:44
dprincegfidente: yes, that is the plan14:44
gfidenteok I see it now, thanks14:45
dprincegfidente: we may in fact be able to share the same base template or something. I haven't modelled that yet.14:45
*** rbrady has joined #tripleo14:46
*** ramishra_ has quit IRC14:47
*** apetrich has quit IRC14:47
*** tiswanso has quit IRC14:48
*** ramishra_ has joined #tripleo14:48
derekhdprince: ok, gonna open up the flood gates14:48
*** tiswanso has joined #tripleo14:48
*** liverpooler has joined #tripleo14:50
*** apetrich has joined #tripleo14:52
*** apetrich has quit IRC14:53
*** ramishra_ has quit IRC14:53
*** apetrich has joined #tripleo14:53
*** paramite is now known as paramite|afk14:54
*** florianf has quit IRC14:55
openstackgerritGiulio Fidente proposed openstack/puppet-tripleo: Add ceph profiles  https://review.openstack.org/30467514:57
*** aufi has quit IRC14:58
*** derekh changes topic to "TripleO | CI cloud is back, currently processing the backlog | CI status: http://tripleo.org/cistatus.html | Docs: http://tripleo.org/"14:58
*** jprovazn has joined #tripleo15:01
*** bvandenh has joined #tripleo15:01
beaglesshardy: was a bit too slow at the end of the meeting typing this in... I ran into a snag last week because I had to recreate my virt environment and for some reason went with the online docs, not the tripleo script15:02
beaglesshardy, I ran into what looks like puppet-concat related issues, this one in the puppet-swift modules.15:02
shardybeagles: Yeah, that's something we need to address - either via everyone (including CI) using tripleo-quickstart15:02
beaglesshardy, k...15:02
shardyor by generating the docs from the CI script (or vice-versa)15:02
beaglesshardy, that's what I wanted to get an idea of - what is the preferred approach to resolving15:03
beaglesshardy, or rather, whether we want to implement the docs (docs take precedence) or document what we actually prefer to do ;)15:04
*** ramishra_ has joined #tripleo15:04
*** coolsvap|away is now known as coolsvap15:05
beaglesshardy, the other thing that was interesting was the reason I had to do this in the first place.. my undercloud VM's filesystem got corrupted. I'm not sure what the root cause was, but I suspect the cache settings for the VM when it was created15:05
beaglesshardy, just a heads up that this might be something that needs attention. I'm looking into whether it is a plausible cause15:06
beaglesdprince, regarding composable patches - I presume you welcome review feedback? ;)15:07
beaglesdprince, or is it early days at the moment? I'm referring specifically to neutron-ish patches15:07
dprincebeagles: sure, jump in and comment on those reviews...15:08
beaglesdprince, awesome15:08
*** afazekas has quit IRC15:09
*** ramishra_ has joined #tripleo15:09
*** afazekas has joined #tripleo15:09
openstackgerritJason Dunsmore proposed openstack/os-collect-config: Convert collectors option to a ListOpt  https://review.openstack.org/30468715:13
openstackgerritGiulio Fidente proposed openstack/puppet-tripleo: Add ceph profiles  https://review.openstack.org/30467515:13
openstackgerritJason Dunsmore proposed openstack/os-collect-config: Convert collectors option to a ListOpt  https://review.openstack.org/30468715:13
openstackgerritGiulio Fidente proposed openstack/puppet-tripleo: Add ceph profiles  https://review.openstack.org/30467515:17
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Use ceph profiles  https://review.openstack.org/30469215:17
*** yamahata has quit IRC15:19
*** ayoung has joined #tripleo15:20
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Use ceph profiles  https://review.openstack.org/30469215:21
gfidentemichchap it misses the hiera call because I'm passing it as class param from https://review.openstack.org/30469215:23
gfidenteso it's a bit of a cleanup15:24
michchapgfidente: oh nice15:25
michchapgfidente: in that case, it probably needs to be the first param - usually required params go above optional.15:25
gfidentemichchap ok15:25
*** paramite|afk has quit IRC15:25
*** oshvartz has quit IRC15:26
*** ramishra_ has quit IRC15:26
*** ramishra_ has joined #tripleo15:32
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Make sure openstack services are dependant on openstack-core  https://review.openstack.org/30459215:35
gfidentejistr so the .sh counterpart15:35
*** ramishra_ has quit IRC15:35
gfidenteI think the .sh without the constraints should go to mitaka15:36
slagledprince: i think that is all that is needed to test stack-update's actually ^^15:36
dprinceslagle: nice, lets see what happens :)15:37
gfidenteslagle that is to test the actual tht submission on update right?15:37
slaglegfidente: yes15:37
shardydprince: Hey, was wondering on your thoughts re container integration with the new ResourceChain composition model15:37
rhalliseyshardy dprince how would the user specify what service will land on the controller15:37
jistrgfidente: yea, i'm not exactly sure when the restart stopped working, but i think the constraint fix should probably happen with the migration to openstack-core & keystone WSGI, so yeah that means a mitaka backport15:37
shardyrhallisey is looking into it, and it seems the needed flow is somewhat backwards to the current model15:38
dprincerhallisey: there is a parameter15:38
jistrgfidente: or actually... depends how we'll decide to do the migrations15:38
dprincerhallisey: ControllerServices15:38
dprincerhallisey: I expect we would add this parameter for all roles15:38
shardye.g atm we have ControllerServiceChain assemble all the configs which are passed into ControllerNodesPostDeployment15:38
openstackgerritBen Nemec proposed openstack/os-net-config: Normalize operstate value for interfaces  https://review.openstack.org/30471215:38
gfidentejistr yeah ... I was trying to split submissions in a way which allows us to backport only those relevant to mitaka15:38
gfidentebuilding on top those which will cope instead with gnocchi15:38
shardydprince: I was thinking it'd be good if e.g docker/services/keystone.yaml actually deployed the service vs just creating the config15:39
dprincerhallisey: once we decompose the controller I expect we can probably refactor all the roles to use a shared -post.yaml template15:39
jistrgfidente: anyway, i'll add it to pacemaker_migrations.sh and i'll make it idempotent, then we can call the function from wherever we want (even the restart.sh if need be)15:39
openstackgerritJames Slagle proposed openstack-infra/tripleo-ci: Actually test stack-update to new tht  https://review.openstack.org/30470715:39
rhalliseydprince, right it's listed then, but do users edit overcloud.yaml?15:39
shardybecause that isolation is one of the advantages vs the puppet model15:39
dprinceshardy: that is the idea15:39
jistrgfidente: i'm now working on reproducing the issue to gather sosreports for bandini, after that i'll do the .sh15:39
dprinceshardy: oh, wait. Are you talking about passing OS::Nova::Server in there again?15:39
gfidentejistr sure I was worrying which of keystone/wsgi, aodh, gnocchi land in mitaka and which in newton15:40
shardydprince: Yes I'm wondering if we have to15:40
rhalliseyso the interface is OS::TripleO::Services::Keystone -> I want keystone15:40
shardye.g for the docker case15:40
dprinceshardy: passing in the resource group would be okay, perhaps15:40
jistrnot sure if gnocchi, but aodh and keystone/wsgi land in mitaka15:40
gfidentejistr to have a better understanding of what the migration scripts are meant to deal with15:40
dprinceshardy: I actually don't think we have to do it that way though15:40
dprinceshardy: if we have a better interface15:40
shardydprince: Ok, I guess I'm not clear how we get from the ResourceChain to deploying all the containers15:41
dprinceshardy: I had expected we'd tackle this after decomposing the controller so I haven't prototyped it entirely15:41
rhalliseyshardy, right now I run the existing roles then do containers in post15:42
shardydprince: Yup, cool - I think rhallisey is trying to prototype it now hence throwing some ideas around :)15:42
dprinceshardy: but I do have a pretty good idea15:42
*** tiswanso has quit IRC15:42
dprinceshardy: yep, I'm aware of it15:42
*** dustins has quit IRC15:42
*** tiswanso has joined #tripleo15:43
rhalliseyshardy, I was thinking of having some like a ServiceList: keystone,neutron ...15:43
rhalliseywould there be a way to map those to a resource15:44
*** tiswanso has joined #tripleo15:44
rhalliseyvs tripleo::docker::services::keystone15:44
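A rough sketch of the mapping rhallisey is asking about: turning a comma-separated service list into resource type names. The `OS::TripleO::Services::` prefix comes from the example earlier in the conversation; the function name and the simple capitalisation rule are assumptions for illustration and would not cover multi-word service names.

```python
def service_resource_types(service_list: str,
                           prefix: str = "OS::TripleO::Services::") -> dict:
    """Map each short service name to a resource type name (illustrative only)."""
    names = [name.strip() for name in service_list.split(",") if name.strip()]
    # Assumed naming rule: capitalise the first letter of the short name.
    return {name: prefix + name.capitalize() for name in names}


mapping = service_resource_types("keystone,neutron")
for short_name, resource_type in mapping.items():
    print(short_name, "->", resource_type)
```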
dprincerhallisey: so let me ask this. Do we want the docker interface refined before we go any further on the composable roles stuff15:44
*** ccamacho has quit IRC15:45
rhalliseydprince, no, you can continue. I just absorb what you have15:45
dprincerhallisey: well, I'm basically asking the core team not to land any features until the controller is done15:45
rhalliseyI use the keystone role as is and plug it into the heat-docker-agents container15:45
dprinceshardy: I think the abstractions can evolve with regards to where we call them15:46
*** ramishra_ has joined #tripleo15:46
dprinceshardy: we aren't committing to an interface yet15:46
rhalliseyshardy, I think it does work15:46
dprincerhallisey: it can work, but I do think we'll refine this a bit further too15:47
rhalliseydprince, I do everything in post15:47
dprincerhallisey: yes, that is good I think15:47
dprincerhallisey: but we can't mention Puppet in overcloud.yaml15:47
rhalliseydprince, so I don't interact much with what you have15:47
rhalliseyI just consume it15:47
dprincerhallisey: so where we create the resource chain might need to change I think....15:47
dprincerhallisey: I can fix that15:47
rhalliseydprince, my current patch uses the puppet resource chain. I'm changing it15:48
dprincerhallisey: we can fix that :)15:48
rhalliseydprince, you can leave it as is. I'm having a container resource chain in post15:48
*** ccamacho has joined #tripleo15:48
rhalliseyI use your resource chain to figure out what puppet modules I need15:48
dprincerhallisey: so we'll want to watch how many chains we create I think15:48
dprincerhallisey: specifically because it could affect the output of 'heat stack-validate' or something15:49
rhalliseyright.. so this would be 2n15:49
shardyYeah, I was thinking it could be much simpler if we just passed the Controllers into ControllerServiceChain, then deploy the containers for each service15:49
shardyand do nothing at all in Post for the container case15:49
dprinceshardy: that means each service can add anything it wants to Heat software configs15:50
dprinceshardy: which means a free-for-all15:50
dprinceshardy: i.e. not much of an interface15:50
*** ramishra_ has quit IRC15:50
shardydprince: So you're proposing we pass out say the container image and the service config?15:51
dprinceshardy: I would expect the docker services to extend the puppet ones15:51
shardythen deploy all the containers in Post?15:51
dprinceshardy: they would 'extend' it like I do the base services for pacemaker15:51
dprinceshardy: http://git.openstack.org/cgit/openstack/tripleo-heat-templates/tree/puppet/services/pacemaker/keystone.yaml#n1815:52
dprinceshardy: that is where the 'config_settings' come from15:52
dprinceshardy: then the docker services would have extra output parameters for the:15:52
dprinceshardy: 1) docker container15:52
shardydprince: Yeah I commented on the neutron patch about that15:52
dprince2) docker compose section15:52
shardyit's fine, but we'll have a lot of layers of nesting just to abstract extending the interface15:52
rhalliseydprince, I was thinking a per service output15:53
dprince3) puppet tags to apply (this is how we generate just configs w/ puppet)15:53
shardywhich is proving really expensive from a heat perspective15:53
rhalliseydprince, ovs_config ovs_container15:53
dprinceand then in the -post.yaml template we can re-combine those or run them separately as we see fit in an organized fashion15:53
rhalliseydprince, keystone_config keystone_container15:53
dprinceshardy: 2 layers of nesting15:53
shardydprince: sure, I guess it's the re-combining I'm not clear on in the container case15:53
dprinceshardy: and do keep in mind that these stacks are global, they aren't created for each server...15:54
shardybecause we don't recombine anything, we keep it separate and launch a bunch of containers15:54
shardydprince: yeah that will help limit things somewhat15:54
dprinceshardy: we can run things separate (per service) for docker15:54
*** jcoufal has quit IRC15:54
shardyIn the pacemaker case, assuming say 25 services, it could still be an extra 50 stacks though15:54
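The arithmetic behind that estimate, as a tiny sketch. The figures (25 services, 2 layers of nesting) are the ones quoted in the conversation; the function name is hypothetical, and the calculation assumes one nested stack per service per layer, created once globally rather than per server.

```python
def extra_nested_stacks(num_services: int, nesting_layers: int) -> int:
    """Nested stacks added when every service contributes one stack per layer."""
    return num_services * nesting_layers


print(extra_nested_stacks(25, 2))  # 25 services * 2 layers = 50
```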
dprinceshardy: regardless of how we manage the resource chain15:54
shardythe EndpointMap unrolling was required due to ~70 IIRC15:55
*** athomas has quit IRC15:55
shardyI'd like to see us improve that inside heat, but it's a known issue atm15:55
*** mgould has quit IRC15:55
dprinceshardy: right, the distinction with these containers is they are nested stacks which are created once and re-used15:56
derekhdprince: shardy So, all the jobs that were queued have failed, squid server didn't have a dns server after the reboot, fixed now, on the bright side the ZUUL queue is now clear...15:56
dprincederekh: way to go. Sounds like you just implemented a 'clear queue' button for us :)15:56
dprincerhallisey: I'm glad to see you prototyping this stuff15:57
* derekh files that under "handy tricks"15:57
shardyderekh: nice ;)15:57
shardydprince: Cool, I'm fine with it for now but looking for optimisations which reduce the stack load a bit15:57
openstackgerritBen Nemec proposed openstack/os-net-config: Add explicit check for no active nics  https://review.openstack.org/30472415:58
shardylike, if all that changes is the step_config, we could potentially select the appropriate config from a map (json parameter)15:58
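A sketch of what selecting the config from a map could look like, assuming the map arrives as a JSON parameter keyed by implementation. The keys and the manifest strings below are made-up placeholders, not real template values, and this says nothing about how Heat itself would express the lookup.

```python
import json


def select_step_config(config_map_json: str, implementation: str) -> str:
    """Pick one step_config out of a JSON map instead of nesting another stack."""
    config_map = json.loads(config_map_json)
    if implementation not in config_map:
        raise ValueError("no step_config for implementation: " + implementation)
    return config_map[implementation]


# Hypothetical map: one entry per way of deploying the same service.
example_map = json.dumps({
    "base": "include ::example::base::keystone",
    "pacemaker": "include ::example::pacemaker::keystone",
})
print(select_step_config(example_map, "pacemaker"))
```

The point of the design is that only a string changes between variants, so a lookup avoids an extra layer of nested stacks.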
rhalliseydprince, so you think I should merge into the service.yaml resource chain?15:58
*** ramishra_ has joined #tripleo15:58
rhalliseydprince, versus a container one in post15:58
dprincerhallisey: I think we might should combine them, yes.15:58
*** ifarkas has quit IRC15:58
dprincerhallisey: my plan was to gradually move all of the config stuff out of controller.yaml first15:59
rhalliseymy current patch reflects that. The patch in my local branch doesn't15:59
dprincerhallisey: once that happens I can move the resource chain into -post.yaml15:59
dprincerhallisey: you are pushing me to do that sooner... :)15:59
*** lucasagomes is now known as lucas-brno15:59
rhalliseyok cool16:00
dprincerhallisey: which is fine, but would require a mega-patch to get the configs out in a massive blob16:00
dprincerhallisey: does that make sense?16:00
rhalliseydprince, yes16:00
rhalliseydprince, I'll take another stab at that patch today16:00
dprincerhallisey: so maybe you duplicate the Chain for now w/ a comment that our plan is to combine them once the roles are decomposed entirely16:00
rhalliseydprince, ok16:01
dprincerhallisey: one thing to check would be the output from a 'heat stack-validate'16:02
*** mkovacik has quit IRC16:02
dprincerhallisey: just to see how your docker service would get exposed to the UI (and eventually the CLI) via its parameters16:02
rhalliseydprince, gotcha.. I haven't gotten my patch completely working yet because I'm passing a string to docker-compose resource vs a json16:02
openstackgerritLars Kellogg-Stedman proposed openstack/tripleo-quickstart: add scripts for performing YAML validation  https://review.openstack.org/30473716:12
openstackgerritMerged openstack/tripleo-quickstart: add ipv4 network-isolation to quickstart for virt deployments  https://review.openstack.org/30303016:12
*** apetrich has quit IRC16:13
*** ramishra_ has quit IRC16:13
*** ramishra_ has joined #tripleo16:14
*** oshvartz has joined #tripleo16:16
*** rbrady has quit IRC16:18
*** rbrady has joined #tripleo16:19
*** shivrao has joined #tripleo16:21
*** shivrao_ has joined #tripleo16:21
dprincebnemec: I just hit the bug you filed pointing to this https://review.openstack.org/#/c/29124316:22
bnemecdprince: Yeah, should be fixed by https://review.openstack.org/#/c/304712/16:23
dprincebnemec: yeah, I would have suggested we kick the tires on the breaking commit before landing it16:23
*** ramishra has quit IRC16:23
dprincebnemec: I think most cores don't realize that os-net-config is somewhat sensitive to real baremetal in this code16:24
bnemecdprince: Yeah, I'm not even hitting it on baremetal though.  It's breaking all of my virtual deployments too.16:24
dprincebnemec: oh, well that too :)16:25
dprincebnemec: so wait, how did this land then?16:25
bnemecI'm not sure.16:25
*** shivrao has quit IRC16:25
*** shivrao_ is now known as shivrao16:25
openstackgerritPradeep Kilambi proposed openstack/puppet-tripleo: Add redis profile  https://review.openstack.org/30475416:25
bnemecIt doesn't make sense to me that it passed CI.16:25
shardyhttps://review.openstack.org/#/c/291243 did pass CI16:25
dprincebnemec: was this the "we weren't building os-net-config packages" problem?16:25
dprinceor perhaps we still aren't!?16:26
bnemecdprince: It's possible, but stuff passed CI after it merged too so I don't think it's actually broken in CI.16:26
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Make sure openstack services are dependent on openstack-core  https://review.openstack.org/30459216:26
*** dmacpher is now known as dmacpher-afk16:26
bnemecUnless the package build was delayed a bunch and it only showed up in the repo yesterday about the time everything else blew up.16:26
openstackgerritBen Nemec proposed openstack/os-net-config: Test breaking change in os-net-config  https://review.openstack.org/30476016:29
* bnemec pushes a broken os-net-config patch just to see if we're testing it properly16:29
dprinceshardy: this fixes it for me https://review.openstack.org/#/c/304712/116:30
dprincebnemec: thanks for pushing this. I've been blocked since yesterday due to the sahara patch, I rebuilt my overcloud image and then I hit this16:30
bnemecdprince: np.  I also failed to deploy a single overcloud yesterday, thanks to this and various other issues.16:31
*** ramishra_ has quit IRC16:33
*** sshnaidm has quit IRC16:33
dprincebnemec: -1 on your second patch though.16:33
dprincebnemec: I think that would break the case where I had no active NICs, but wanted to configure a bridge or something16:34
dprincebnemec: or it could. if that function got called, either now or in the future16:34
shardybnemec: if it broke yesterday doesn't that imply we're not building from the repo, e.g we released 0.2.4 yesterday?16:35
bnemecdprince: Is it even possible to configure a working bridge without an interface?16:35
bnemecshardy: Oh, was there a release?  That could be it.16:36
dprincebnemec:  brctl addbr foo16:36
* bnemec looks at what is installed16:36
jistrmarios: re your question on scrum -- reported the bug https://bugs.launchpad.net/tripleo/+bug/156944416:36
openstackLaunchpad bug 1569444 in tripleo "pacemaker_resource_restart.sh hangs on crm_resource --wait" [High,In progress] - Assigned to Jiří Stránský (jistr)16:36
bnemecshardy: My image has os-net-config-0.2.5-0.20160411183356.2ab73df.el7.centos.noarch16:36
mariosjistr: ack tx16:37
shardybnemec: ack, OK I guess it's just coincidence as that patch merged yesterday also16:37
bnemecdprince: But will it do anything?16:38
bnemecI guess I can just make it a warning in any case.  At least the information is there then.16:38
shardywhich actually was from 2188cf1651648af1900b7bb070f9b1eb3f982c3b just before it merged16:38
bnemecshardy: I pushed a change that should break, so we'll see if CI is testing properly: https://review.openstack.org/#/c/304760/1/os_net_config/objects.py16:38
shardybnemec: sounds good, thanks16:39
*** dmacpher-afk is now known as dmacpher16:40
gfidentebnemec though last release is 0.2.4, so I have no clue what 0.2.5 is for16:43
openstackgerritPradeep Kilambi proposed openstack/puppet-tripleo: Add mongodb profiles  https://review.openstack.org/30478016:43
*** shivrao has quit IRC16:43
bnemecgfidente: I assume it has to do with how pbr computes versions.16:44
bnemecNot that I have any clue what the semantics around that are these days.16:44
*** sambetts is now known as sambetts|afk16:45
trowngfidente: delorean takes its version from `python setup.py --version` which takes version from pbr16:46
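A simplified sketch of why a 0.2.5 package can appear when the last release is 0.2.4, assuming pbr's usual behaviour of bumping the patch level for commits made after the last tag. This is not pbr's actual code, and the commit count used in the example is hypothetical.

```python
def next_dev_version(last_tag: str, commits_since_tag: int) -> str:
    """Approximate the version reported for an untagged commit (illustrative)."""
    if commits_since_tag == 0:
        # Sitting exactly on the tag: the version is the tag itself.
        return last_tag
    major, minor, patch = (int(part) for part in last_tag.split("."))
    # Commits past the tag are treated as development toward the next patch release.
    return f"{major}.{minor}.{patch + 1}.dev{commits_since_tag}"


print(next_dev_version("0.2.4", 3))
```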
*** akshai has quit IRC16:47
*** liverpooler has quit IRC16:47
*** dprince has quit IRC16:47
* trown should not have +A'd that os-net-config patch...16:48
openstackgerritPradeep Kilambi proposed openstack/puppet-tripleo: Add mongodb profiles  https://review.openstack.org/30478016:48
trownI put +1 originally because it looked good, but then saw it had passing CI and 2 +2's so just +2'd to +A16:48
*** akshai has joined #tripleo16:52
*** ramishra has joined #tripleo16:54
*** ramishra has quit IRC16:54
*** Guest15115 has quit IRC16:55
*** trown is now known as trown|lunch16:57
openstackgerritJames Slagle proposed openstack-infra/tripleo-ci: Do Not Merge: Test stack-update  https://review.openstack.org/30478716:59
openstackgerritBen Nemec proposed openstack/os-net-config: Add warning for no active nics  https://review.openstack.org/30472417:00
*** tiswanso has quit IRC17:00
*** manous has joined #tripleo17:01
*** akshai has quit IRC17:03
*** akshai has joined #tripleo17:03
*** ramishra has joined #tripleo17:04
*** ramishra has quit IRC17:06
*** ramishra has joined #tripleo17:09
shardytrown|lunch, larsks: Hey I just raised a few quickstart bugs - they're mostly known issues or things we've previously discussed but I wanted to keep track of them17:13
larsksshardy: thanks!17:14
*** gfidente has quit IRC17:14
larsksshardy: note re: https://bugs.launchpad.net/tripleo-quickstart/+bug/1569472 that there is actually documentation about that in https://github.com/redhat-openstack/tripleo-quickstart/blob/master/docs/accessing-libvirt.md17:15
openstackLaunchpad bug 1569472 in tripleo-quickstart "VMs not reflected in virt-manager" [Undecided,New]17:15
*** jistr has quit IRC17:15
larsks(although that is for virsh rather than virt-manager, which is a little trickier)17:15
shardyWhen we figure out where the upstream images are coming from it'd be great if we could fully automate everything, so you run quickstart then everything is ready to go17:15
shardylarsks: thanks, I'd not spotted that17:16
shardylarsks: I still think it's confusing for folks running it on their local box, already via an unprivileged account17:16
shardyso it'd be good to consider if there's more we can do to streamline things, or make the "where are my VMs" docs more prominent17:17
shardywhat *really* tripped me up was accidentally running with sudo17:17
larsksThat should pretty much work, actually.  As should getting to an unprivileged account via 'su -'.17:17
larsksBut yeah, we can make the docs more obvious.17:17
shardythen you see the VMs, but running without sudo they are invisible17:17
*** ramishra has quit IRC17:17
shardylarsks: I've just been running via my normal shardy account17:18
shardyfirst attempt I made a mistake and did sudo quickstart.sh localhost17:18
shardythat creates the VMs, visible in virt-manager, but then fails to boot the undercloud17:18
larsksshardy: Actually, if you want to update the bug with the specific scenarios you tried that didn't work as expected, that would be a useful reference to have handy.17:19
shardyrunning it again works fine without the sudo, but as an ex instack-virt-setup user, it's confusing17:19
*** qasims has joined #tripleo17:19
shardylarsks: done17:23
*** manous has quit IRC17:24
shardysome of this is completely user error, but I'm trying to highlight pitfalls other folks may encounter :)17:24
openstackgerritEthan Gafford proposed openstack/python-tripleoclient: Trove integration  https://review.openstack.org/23324117:25
openstackgerritEthan Gafford proposed openstack/tripleo-heat-templates: Trove Integration  https://review.openstack.org/23324017:25
*** jpena is now known as jpena|off17:28
*** ramishra has joined #tripleo17:29
larsksshardy: highlighting pitfalls is extremely useful!17:29
*** sshnaidm has joined #tripleo17:29
*** davidlenwell has quit IRC17:29
*** davidlenwell has joined #tripleo17:36
*** ramishra has quit IRC17:37
*** dujelly has joined #tripleo17:42
*** ramishra has joined #tripleo17:43
*** jaosorior has joined #tripleo17:45
*** cwolferh has joined #tripleo17:46
*** shardy has quit IRC17:47
*** ramishra has quit IRC17:48
*** ramishra has joined #tripleo17:49
openstackgerritPradeep Kilambi proposed openstack/puppet-tripleo: Add redis profile  https://review.openstack.org/30475417:56
*** trown|lunch is now known as trown17:57
*** manous_ has joined #tripleo17:58
*** shivrao has joined #tripleo17:59
*** ramishra has quit IRC17:59
*** jaosorior has quit IRC17:59
*** ramishra has joined #tripleo18:00
bnemechttps://review.openstack.org/#/c/304802/ fixes the duplicate sections problem in oslo.config for me.18:05
trownbnemec: nice, I will give that a go.18:06
bnemecIt's a little ugly, but I got 100% unit test coverage of the new code so I'm reasonably confident it's correct.18:07
bnemecIt also looks like we are at least testing the changes in os-net-config: http://logs.openstack.org/60/304760/1/check-tripleo/gate-tripleo-ci-f22-ha/cc7b9a0/console.html#_2016-04-12_16_58_38_20218:08
bnemecI suppose it's possible we aren't installing it in the overcloud image for some reason though.18:09
*** ramishra has quit IRC18:09
*** coolsvap has quit IRC18:10
trownit is also possible (though a bit weirder) that operstate is upper case in our CI env18:10
trownoh... but then CI would not be broken18:10
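A minimal sketch of the kind of check under discussion: reading an interface's operstate from sysfs and comparing it case-insensitively. This is an illustration under assumptions, not the os-net-config implementation; the function name and the overridable `sys_class_net` parameter are hypothetical.

```python
import os


def is_active_nic(interface: str, sys_class_net: str = "/sys/class/net") -> bool:
    """Return True if the interface's operstate reads as "up", ignoring case."""
    path = os.path.join(sys_class_net, interface, "operstate")
    try:
        with open(path) as state_file:
            state = state_file.read()
    except OSError:
        # Missing or unreadable operstate: treat the interface as inactive.
        return False
    # Normalise so that "UP", "up" and "up\n" all compare equal.
    return state.strip().lower() == "up"
```

An exact comparison against "up" would report no active nics on a system that writes the value in upper case, which would match the symptom described above.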
*** ramishra has joined #tripleo18:10
*** dujelly has quit IRC18:12
*** ramishra has quit IRC18:16
*** ramishra has joined #tripleo18:17
*** ramishra has quit IRC18:18
*** ramishra has joined #tripleo18:23
openstackgerritBen Nemec proposed openstack/os-net-config: Nothing to see here  https://review.openstack.org/30476018:25
bnemecI think I owe derek royalties now or something. :-)18:25
*** ramishra has quit IRC18:28
*** ramishra has joined #tripleo18:28
openstackgerritwes hayutin proposed openstack/tripleo-quickstart: ignore errors on virsh net-undefine in libvirt cleanup  https://review.openstack.org/30481018:29
bnemecCI networking looks to be in a bad way again: http://logstash.openstack.org/#dashboard/file/logstash.json?query=build_name%3A%20*tripleo-ci*%20AND%20build_status%3A%20FAILURE%20AND%20(message%3A%20%5C%22Could%20not%20resolve%20host%3A%20github.com%5C%22%20OR%20message%3A%20%5C%22fatal%3A%20The%20remote%20end%20hung%20up%20unexpectedly%5C%22)18:30
bnemecEvery job I've looked at has died on a network failure of some sort.18:30
slaglebnemec: any objection to just merging this? https://review.openstack.org/#/c/30471218:37
slaglei'm seeing CI failures where os-net-config finds no nics18:38
*** ramishra has quit IRC18:39
openstackgerritMerged openstack/tripleo-quickstart: Add ignore error on virsh destroy  https://review.openstack.org/30428618:39
bnemecslagle: It's somewhat concerning to me that that patch failed and has no controller logs.  Maybe we really aren't testing os-net-config on the overcloud properly.18:39
*** ramishra has joined #tripleo18:39
bnemecIn which case I guess we might as well merge it though.18:40
slagleyea it can't get the logs b/c there are no nics18:40
*** florianf has quit IRC18:41
trownthere are no nics in CI on the patch that is meant to fix there being no nics?18:41
bnemecYeah, that's what concerns me.18:41
slaglebnemec: this is how i'm seeing it fail: http://paste.openstack.org/show/493857/18:42
slaglei guess what's odd is that there should be an eth218:42
bnemecNo logs at all on the ha job where os-net-config is used on all the nodes.18:42
trownbnemec: I bet we are not building it and just pulling from delorean current18:42
bnemecslagle: Yeah, that looks related to this issue.18:42
trownbnemec: and yesterday there were some delorean outages that meant we were on the backup server so it took a while to get the broken package18:42
bnemectrown: We're definitely building it.  My test broken patch blew up on the undercloud.18:42
trownoh right18:43
bnemecBut that doesn't mean it's getting installed in the overcloud image.18:43
slaglelet's just merge18:43
bnemecslagle: Yeah, I'm fine with that.18:43
trownmaybe overcloud image is getting delorean package?18:43
bnemecIt's a tiny change and everyone who's looked at it says it matches their environment.18:43
trownya and we can't break broken18:44
bnemectrown: It's possible.  I don't actually understand how the test stuff is injected into the overcloud image build.  Maybe it isn't. :-/18:44
*** tiswanso has quit IRC18:44
bnemecNow I'm really curious to see the results on https://review.openstack.org/#/c/304760/18:45
bnemecAlthough if it's broken we won't see any useful results. :-(18:45
*** apetrich has joined #tripleo18:46
trownoh ya, cause no logs... though it would be interesting to recheck after we get CI fixed18:46
trownI would put money on that warning not showing up18:47
openstackgerritMerged openstack/os-net-config: Normalize operstate value for interfaces  https://review.openstack.org/30471218:47
*** ramishra has quit IRC18:49
*** ramishra has joined #tripleo18:50
*** apetrich_ has joined #tripleo18:52
*** apetrich has quit IRC18:53
*** ramishra has quit IRC18:54
*** ramishra has joined #tripleo18:54
*** ramishra has quit IRC18:54
*** tiswanso has joined #tripleo19:01
*** dprince has joined #tripleo19:06
*** ramishra has joined #tripleo19:06
openstackgerritBen Nemec proposed openstack/os-net-config: Nothing to see here...probably  https://review.openstack.org/30476019:08
*** ramishra has quit IRC19:11
*** ramishra has joined #tripleo19:13
bnemecBetter logstash query for the networking issues: http://logstash.openstack.org/#dashboard/file/logstash.json?query=build_name%3A%20*tripleo-ci*%20AND%20build_status%3A%20FAILURE%20AND%20(message%3A%20%5C%22Could%20not%20resolve%20host%3A%20github.com%5C%22%20OR%20message%3A%20%5C%22fatal%3A%20The%20remote%20end%20hung%20up%20unexpectedly%5C%22)19:15
bnemecAlthough logstash is still missing a bunch of failures for some reason.19:15
*** ramishra has quit IRC19:21
*** ramishra has joined #tripleo19:22
*** qasims has quit IRC19:25
*** ramishra has quit IRC19:31
*** ramishra has joined #tripleo19:31
*** dustins has quit IRC19:39
*** ramishra has quit IRC19:41
*** dustins has joined #tripleo19:42
*** ramishra has joined #tripleo19:42
*** ramishra has quit IRC19:45
*** ramishra has joined #tripleo19:47
*** ramishra has quit IRC19:56
*** lblanchard has quit IRC19:58
*** apetrich_ has quit IRC20:00
*** tiswanso has quit IRC20:00
*** tiswanso has joined #tripleo20:04
*** ramishra has quit IRC20:11
*** qasims has joined #tripleo20:12
*** ramishra has joined #tripleo20:12
*** ramishra has quit IRC20:24
*** ramishra has joined #tripleo20:24
*** ramishra has joined #tripleo20:29
*** ccamacho has quit IRC20:38
*** ramishra has quit IRC20:41
*** ramishra has joined #tripleo20:42
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Explicitly set nova and neutron host on controllers  https://review.openstack.org/30485820:51
openstackgerritJohn Trowbridge proposed openstack/tripleo-quickstart: Switch to using standalone role for image building  https://review.openstack.org/30486020:56
*** ramishra has quit IRC20:56
openstackgerritJohn Trowbridge proposed openstack/tripleo-quickstart: Switch to using standalone role for image building  https://review.openstack.org/30486020:59
openstackgerritBen Nemec proposed openstack-infra/tripleo-ci: Add final success/failure message to ping test  https://review.openstack.org/30486421:04
*** ramishra has joined #tripleo21:05
*** MaxPC has quit IRC21:07
openstackgerritJohn Trowbridge proposed openstack/tripleo-quickstart: Switch to using standalone role for image building  https://review.openstack.org/30486021:07
*** julim has quit IRC21:11
*** qasims has quit IRC21:11
*** trown is now known as trown|outtypewww21:11
*** ramishra has quit IRC21:17
*** ramishra has joined #tripleo21:20
*** oshvartz has quit IRC21:36
*** ramishra has quit IRC21:42
*** ayoung has quit IRC21:46
*** ramishra has joined #tripleo21:48
*** oshvartz has joined #tripleo21:50
*** fragatina has quit IRC21:54
*** tiswanso has quit IRC21:55
*** ramishra has quit IRC21:57
*** ramishra has joined #tripleo21:58
*** akshai has quit IRC22:07
*** morazi has quit IRC22:11
*** ramishra has quit IRC22:19
*** ramishra has joined #tripleo22:20
*** fragatina has joined #tripleo22:24
*** ramishra has quit IRC22:25
derekhslagle: bnemec just in case yer looking into it at the moment, looks like the net problems don't seem to be with our openstack deployment22:27
derekhI've been trying to locate where packets are being dropped, and it can be reproduced fairly easily from the bastion22:28
*** ramishra has joined #tripleo22:28
derekhafter running a bunch of these [derekh@host01-rack01 ~]$ host -v -t A git.openstack.org
derekhabout 1 in 10 or so time out: ;; connection timed out; trying next origin22:29
derekhtcpdump shows the UDP packet going out but nothing coming back22:29
derekhI'll ping the lab guys in the morning22:30
derekhslagle: bnemec ^22:30
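The reproduction derekh describes (running the same `host` lookup many times and counting how often it fails) could be scripted roughly as follows. This is a sketch under assumptions, not what was actually run on the bastion: `host_lookup` and `failure_rate` are hypothetical helper names, the 10-second timeout and 50 attempts are arbitrary, and the lookup omits the `-v` flag because only the exit status is checked.

```python
import subprocess
from typing import Callable


def host_lookup(name: str, timeout_s: float = 10.0) -> bool:
    """Run `host -t A <name>` once; True if it exits 0 within timeout_s."""
    try:
        result = subprocess.run(
            ["host", "-t", "A", name],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # assumed limit, not taken from the log
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def failure_rate(probe: Callable[[], bool], attempts: int) -> float:
    """Call probe() `attempts` times and return the fraction that failed."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    failures = sum(1 for _ in range(attempts) if not probe())
    return failures / attempts


if __name__ == "__main__":
    # Requires the `host` utility and network access.
    rate = failure_rate(lambda: host_lookup("git.openstack.org"), attempts=50)
    print(f"{rate:.0%} of lookups failed")
```

Passing the probe in as a callable keeps the counting logic separate from the network call, so the same loop can be pointed at a different resolver or hostname when narrowing down where packets are dropped.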
*** fragatina has quit IRC22:30
*** ayoung has joined #tripleo22:31
*** Marga_ has joined #tripleo22:32
*** ramishra has quit IRC22:33
*** derekh has quit IRC22:33
*** ramishra has joined #tripleo22:34
*** ramishra has quit IRC22:35
*** Marga_ has quit IRC22:35
*** ramishra has joined #tripleo22:39
*** ebalduf_ has quit IRC22:47
*** ramishra has quit IRC22:52
*** ramishra has joined #tripleo22:53
*** sanjay__u has quit IRC22:55
*** ramishra has quit IRC23:07
*** ramishra has joined #tripleo23:08
openstackgerritOpenStack Proposal Bot proposed openstack/tripleo-common: Updated from global requirements  https://review.openstack.org/30062923:15
*** akshai has joined #tripleo23:16
*** yuanying has joined #tripleo23:18
*** ramishra has quit IRC23:19
*** ramishra has joined #tripleo23:20
*** akshai has quit IRC23:21
*** ramishra has quit IRC23:24
*** akshai_ has quit IRC23:30
*** yuanying has quit IRC23:36
*** ramishra has quit IRC23:59

